lingo.lol is one of the many independent Mastodon servers you can use to participate in the fediverse.
A place for linguists, philologists, and other lovers of languages.

Server stats: 59 active users

#embeddings

Svenja Guhr:
Using #doc2vec #embeddings and #TextSimilarity, she analyzes how canonized works influence later literary production while highlighting texts that have been forgotten, marginalized, or overlooked.
#Canon vs. #Counter-canon becomes a scale shaped by retrospective, cultural, and temporal markers. #CLS

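For readers who don't know the method named here, a minimal sketch of doc2vec-based text similarity with gensim; the toy corpus, tags, and parameters are illustrative assumptions, not material from the study.

```python
# Hypothetical sketch: doc2vec embeddings + cosine similarity between texts.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = {
    "canonical_novel": "call me ishmael some years ago never mind how long",
    "forgotten_novel": "the harbour town slept under a film of coal dust",
    "later_work":      "some years ago a young sailor left the harbour town",
}

# Each text becomes a TaggedDocument: a token list plus a unique tag.
documents = [TaggedDocument(words=text.split(), tags=[name])
             for name, text in corpus.items()]

model = Doc2Vec(documents, vector_size=50, min_count=1, epochs=40)

# Cosine similarity between document vectors approximates how strongly
# a later text echoes an earlier one.
print(model.dv.similarity("canonical_novel", "later_work"))
print(model.dv.similarity("forgotten_novel", "later_work"))
```
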
Mathieu Jacomy:
Ah, my latest tool, just out of the oven! Just in time for my summer break... It's called *Vandolie*. It's for high school students, but it may work for you as well. I will let you discover it by yourself.
👉 https://jacomyma.github.io/vandolie/en/
It's like a mini CorTexT for teenagers, if you know that tool. But it runs entirely in the browser.
Entirely localized in Danish.
Consider it a beta version. Usable, but feel free to file GitHub issues for feedback & bugs.
#CSSH #DistantReading #embeddings

➴➴➴Æ🜔Ɲ.Ƈꭚ⍴𝔥єɼ👩🏻‍💻:
Okay, back-of-the-napkin math:
- There are probably 100 million sites and 1.5 billion pages worth indexing in a #search engine
- It takes about 1 TB to #index 30 million pages
- We only care about text on a page

I define a page as worth indexing if:
- It is not a FAANG site
- It has at least one referrer (no DD Web)
- It's active

So this means we need 40 TB of fast storage to make a good index for the internet. That's not "runs locally" sized, but it is nonprofit sized.

My size assumptions are basically as follows:
- #URL
- #TFIDF information
- Text #Embeddings
- Snippet

We can store an index entry in about 30 KB, so a full internet index fits in roughly 40 TB. That's about $500 in storage.

Access time becomes a problem. TF-IDF for the whole internet can easily fit in RAM. Even with #quantized embeddings, you can only fit 2 million per GB of RAM.

Assuming you had enough RAM it could be fast: TF-IDF to get 100 million candidates, #FAISS to sort those, load snippets dynamically, potentially modify rank by referrers, etc.

Six 128 GB #Framework #desktops, each with 5 TB drives (plus one Raspberry Pi to sort the final candidates from the six machines), is enough to replace #Google. That's about $15k.

In two to three years this will be doable on a single machine for around $3k.

By the end of the decade it should run as an app on a powerful desktop.

Three years after that it can run on a #laptop.

Three years after that it can run on a #cellphone.

By #2040 it's a background process on your cellphone.

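A hedged sketch of the TF-IDF-then-#FAISS stage of that pipeline, using an IVF+PQ (quantized) index; the dimension, shard size, and random vectors are stand-in assumptions, not measurements.

```python
# Toy quantized FAISS index over page embeddings (illustrative assumptions).
import numpy as np
import faiss

d = 384                  # embedding dimension (an assumption)
n = 100_000              # pages in this toy shard, not 1.5 billion

rng = np.random.default_rng(0)
page_vecs = rng.standard_normal((n, d)).astype("float32")
faiss.normalize_L2(page_vecs)    # L2 search on unit vectors ranks like cosine

# IVF + product quantization compresses each vector to a few bytes of code,
# which is what makes "millions of embeddings per GB of RAM" plausible.
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFPQ(quantizer, d, 1024, 8, 8)  # 1024 lists, 8-byte codes
index.train(page_vecs)
index.add(page_vecs)

query = rng.standard_normal((1, d)).astype("float32")
faiss.normalize_L2(query)
index.nprobe = 32                     # inverted lists to visit per query
dist, ids = index.search(query, 10)   # top-10 candidate page ids to rerank
print(ids[0])
```

The IVF/PQ parameters trade recall for memory; snippets and referrer-based rank adjustments would live outside the index, as the post suggests.
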
JMLR:
'Variance-Aware Estimation of Kernel Mean Embedding', by Geoffrey Wolfer, Pierre Alquier.
http://jmlr.org/papers/v26/23-0161.html
#embeddings #embedding #empirical

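For background on the object the paper studies (this is the standard plug-in estimator, not the paper's variance-aware one): given samples x_1, ..., x_n from P and a kernel k, the kernel mean embedding and its empirical estimate are

```latex
\[
\mu_P = \mathbb{E}_{X \sim P}\left[ k(X, \cdot) \right],
\qquad
\hat{\mu}_P = \frac{1}{n} \sum_{i=1}^{n} k(x_i, \cdot),
\]
```

with the plug-in estimate converging to the true embedding in RKHS norm at the usual O_P(n^{-1/2}) rate for bounded kernels.
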
FIZ ISE Research Group:
We are very happy that our colleague @GenAsefa has contributed the chapter "Neurosymbolic Methods for Dynamic Knowledge Graphs" to the newly published Handbook on Neurosymbolic AI and Knowledge Graphs, together with Mehwish Alam and Pierre-Henri Paris.
Handbook: https://ebooks.iospress.nl/doi/10.3233/FAIA400
Our chapter on arXiv: https://arxiv.org/abs/2409.04572
#neurosymbolicAI #AI #generativeAI #LLMs #knowledgegraphs #semanticweb #embeddings #graphembeddings

FIZ ISE Research Group:
Poster from our colleague @epoz from the UGent-IMEC Linked Data & Solid course: "Exploding Mittens - Getting to grips with huge SKOS datasets", on semantic-embedding-enhanced SPARQL queries for ICONCLASS data.
Congrats on the 'best poster' award ;-)
Poster: https://zenodo.org/records/14887544
ICONCLASS on GitHub: https://github.com/iconclass
#rdf2vec #bert #llm #embeddings #iconclass #semanticweb #lod #linkeddata #knowledgegraphs #dh @nfdi4culture @fiz_karlsruhe

PyData Madrid:
We have a date on 20 February 🔥 See you at BBVA AI Factory to talk about embeddings for financial contracting and about graph neural networks. We are trying out @guild.host, so reserve your spot here! 👇 https://guild.host/events/embeddings-para-contratacin-306046050 #PyData #PyDataMadrid #python #embeddings #GraphNeuralNetworks
📄 Embeddings para contratación...

Harald Sack:
In 2013, Mikolov et al. (from Google) published word2vec, a neural-network-based framework to learn distributed representations of words as dense vectors in continuous space, aka word embeddings.
T. Mikolov et al. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781
https://arxiv.org/abs/1301.3781
#HistoryOfAI #AI #ise2024 #lecture #distributionalsemantics #wordembeddings #embeddings @sourisnumerique @enorouzi @fizise

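A minimal sketch of the idea using gensim's reimplementation (an assumption; the 2013 release was a standalone C tool), with a toy corpus:

```python
# Word2vec: words appearing in similar contexts get nearby dense vectors.
from gensim.models import Word2Vec

sentences = [
    ["king", "rules", "the", "kingdom"],
    ["queen", "rules", "the", "kingdom"],
    ["man", "walks", "in", "the", "city"],
    ["woman", "walks", "in", "the", "city"],
]

# Skip-gram (sg=1), one of the two architectures from Mikolov et al. (2013).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(model.wv.similarity("king", "queen"))  # shared contexts -> high similarity
```
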
Sören Auer 🇪🇺🇺🇦:
Ask your (research) question against 76 million scientific articles: https://ask.orkg.org
@orkg ASK (Assistant for Scientific Knowledge) uses vector #embeddings to find the most relevant papers and an open-source #LLM to synthesize the answer for you.

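The retrieve-then-synthesize pattern described here, as a toy sketch; the encoder model, corpus, and prompt are assumptions, not ORKG ASK's actual stack.

```python
# Embed papers, retrieve the nearest to the question, build an LLM prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

papers = [
    "TransE learns translation-based knowledge graph embeddings.",
    "Word2vec estimates word representations in vector space.",
    "BERT pretrains deep bidirectional transformers for language.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # model is an assumption
paper_vecs = encoder.encode(papers, normalize_embeddings=True)

question = "How can I embed a knowledge graph?"
q_vec = encoder.encode([question], normalize_embeddings=True)[0]

# Dot product on normalized vectors = cosine similarity; keep the best two
# papers as context for an open-source LLM to synthesize the answer.
top = np.argsort(-(paper_vecs @ q_vec))[:2]
context = "\n".join(papers[i] for i in top)
prompt = f"Answer from these abstracts:\n{context}\n\nQuestion: {question}"
print(prompt)
```
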
Harald Sack:
Interesting new survey paper on ontology embeddings:
Jiaoyan Chen, Olga Mashkova, Fernando Zhapa-Camacho, Robert Hoehndorf, Yuan He, Ian Horrocks: Ontology Embedding: A Survey of Methods, Applications and Resources.
https://arxiv.org/abs/2406.10964
#ontologies #embeddings #knowledgegraphs

Harald Sack:
A second add-on to our free MOOC lecture series on #KnowledgeGraphs is a Colab notebook on knowledge graph completion with TransE, through which my colleagues Ann Tan and @MahsaVafaie will guide you in the video.
#OpenHPI video: https://open.hpi.de/courses/knowledgegraphs2023/items/48Sn5Tr9RKo24RXu7OwgOz
YouTube video: https://www.youtube.com/watch?v=IVTVzgCbHOw&list=PLNXdQl4kBgzubTOfY5cbtxZCgg9UTe-uF&index=67
Colab notebook: https://colab.research.google.com/drive/104ad-kusmzfYgkK_L8ETWUAjfdreE9e2#scrollTo=fQ08XPbaZgQ4
@fizise @fiz_karlsruhe #semanticweb #kge #embeddings #linkprediction #videolecture #video

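For orientation before the notebook: TransE scores a triple (h, r, t) as plausible when h + r lands near t in embedding space. A toy sketch with hand-made vectors (illustrative assumptions, not the notebook's code):

```python
import numpy as np

# Hand-made 2-D embeddings, purely for illustration.
emb = {
    "Paris":     np.array([0.2, 0.3]),
    "France":    np.array([1.0, 1.0]),
    "Berlin":    np.array([0.5, 0.9]),
    "capitalOf": np.array([0.8, 0.7]),   # relation as a translation vector
}

def transe_score(h, r, t):
    # TransE: lower ||h + r - t|| means a more plausible triple.
    return float(np.linalg.norm(emb[h] + emb[r] - emb[t]))

print(transe_score("Paris", "capitalOf", "France"))   # 0.0 -> plausible
print(transe_score("Berlin", "capitalOf", "France"))  # larger -> implausible
```
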
Harald Sack:
Knowledge Graph Embeddings (KGEs) are a very useful tool for few- and zero-shot learning. Of course, Link Prediction and #KnowledgeGraph Completion are the most prominent tasks for KGEs. My colleague Ann Tan and I will start our investigation of KGEs in this section of our free #kg2023 lecture.
OpenHPI video: https://open.hpi.de/courses/knowledgegraphs2023/items/3xfeKrryLMeY45OXSwBd86
YouTube video: https://www.youtube.com/watch?v=UGmtYSCXsQk&list=PLNXdQl4kBgzubTOfY5cbtxZCgg9UTe-uF&index=62
Slides: https://zenodo.org/records/10185280
@tabea @sashabruns @MahsaVafaie @fiz_karlsruhe @fizise #embeddings #linkprediction

Zane Selvans:
I also really enjoyed this introduction to embeddings and the LLM tooling that's growing up around them. I'm hopeful we might actually be able to build a retrieval-augmented generation system for utility commission regulatory filings now! Though we would probably need to find some substantial funding to calculate the embeddings for tens of millions of pages of PDFs...
https://simonwillison.net/2023/Oct/23/embeddings/
#LLM #Embeddings

Tim Kellogg:
I wish I knew more about comparing #embeddings. Anyone have resources? One thing I've wondered is how to convert an embedding from a "point" to an "area" or "volume". E.g. an embedding of a five-paragraph essay will occupy a single point in embedding space, but if you broke it down (e.g. by paragraph), there would be several points, and the whole would presumably be at the center. Is there a way to trace the full space a text occupies in #embedding space? #LLMs #LLM #AI #NLP

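One crude way to make the question concrete, as a sketch: embed each paragraph, then describe the essay by the centroid of the chunk points plus their spread. The encoder and texts are assumptions; the covariance matrix of the chunk vectors would give a richer ellipsoidal "region".

```python
# Centroid + spread of per-paragraph embeddings as a point-to-region proxy.
import numpy as np
from sentence_transformers import SentenceTransformer

paragraphs = [
    "Embeddings map text to points in a vector space.",
    "Splitting a text yields one point per chunk.",
    "The centroid of those points approximates the whole.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
vecs = encoder.encode(paragraphs, normalize_embeddings=True)

centroid = vecs.mean(axis=0)                       # the "whole essay" point
radii = np.linalg.norm(vecs - centroid, axis=1)    # chunk distances from it

print(centroid.shape, radii.mean())  # mean radius as a scalar "volume" proxy
```
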
Harald Sack:
Many new and interesting topics in our upcoming #KnowledgeGraphs - Foundations and Applications online lecture at #openhpi:
- knowledge representation with graphs
- #RDF & RDFS
- querying RDF with #SPARQL & #SHACL
- #OWL & Description Logics
- ontological engineering
- knowledge graph #embeddings
- large #languagemodels #llm
Free registration open: https://open.hpi.de/courses/knowledgegraphs2023
@tabea @sashabruns @fizise #mooc #lecture

Harald Sack:
Keynote by Heiko Paulheim (still not arrived here in the #fediverse) at RuleML+RR 2023 in Oslo. Talk: "Knowledge Graph Embeddings meet Symbolic Schemas, or: What Do They Actually Learn?"
Slides: https://www.uni-mannheim.de/media/Einrichtungen/dws/Files_People/Profs/heiko/talks/RuleMLRR_2023.pdf
#knowledgegraph #semanticweb #embeddings #graphembeddings #rdf2vec #RuleML+RR. Of course, this slide had to be from Heiko 😉 #montypython @heikopaulheim@twitter.com

Harald Sack:
Of course I know RDF2vec ;-)
However, Heiko Paulheim's RDF2vec website has grown nicely and has become a rather valuable resource for this #knowledgegraph embedding method, including implementations, models and services, variations and extensions, as well as more than 150 references in scientific papers!
RDF2vec website: http://www.rdf2vec.org/
Original paper: https://madoc.bib.uni-mannheim.de/41307/1/Ristoski_RDF2Vec.pdf
Petar Ristoski's PhD thesis (with RDF2vec): https://ub-madoc.bib.uni-mannheim.de/43730/1/Ristoski_PhDThesis_final.pdf
#kge #deeplearning #embeddings

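The core RDF2vec recipe (random walks over the graph, fed to word2vec) in a toy sketch; the tiny graph and walk scheme are illustrative, not the reference implementation from rdf2vec.org.

```python
# Random graph walks -> "sentences" -> word2vec entity embeddings.
import random
from gensim.models import Word2Vec

graph = {  # subject -> list of (predicate, object) edges
    "Mannheim":  [("locatedIn", "Germany")],
    "Karlsruhe": [("locatedIn", "Germany")],
    "Germany":   [("partOf", "Europe")],
}

def random_walk(node, depth=4):
    walk = [node]
    for _ in range(depth):
        edges = graph.get(node)
        if not edges:
            break
        pred, node = random.choice(edges)
        walk += [pred, node]      # walks interleave predicates and entities
    return walk

walks = [random_walk(n) for n in graph for _ in range(50)]
model = Word2Vec(walks, vector_size=32, window=4, min_count=1, sg=1)

print(model.wv.similarity("Mannheim", "Karlsruhe"))  # co-located -> similar
```
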
Tero Keski-Valkama<p>"List some example questions that the provided content answers."</p><p>Did you know that you can easily use <a href="https://rukii.net/tags/chatbots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>chatbots</span></a> to generate phrases for a <a href="https://rukii.net/tags/VectorIndex" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VectorIndex</span></a>? <a href="https://rukii.net/tags/Embeddings" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Embeddings</span></a> of example questions match way better to embeddings of user queries than whole document embeddings or keyword embeddings.</p><p>They aren't mutually exclusive though – it makes sense to use all those embeddings for a single document, and obviously multiple embeddings of each type (except for the whole document embedding obviously).</p><p>And once more, never use embeddings significantly larger than 128 dimensions.</p>
Harald Sack:
As a second topic of this last #ise2023 lecture, we discussed #KnowledgeGraph Completion. The simplest approach to unsupervised #linkprediction based on (here translation-based) knowledge graph embeddings was explained using the example of Isaac Asimov.
Slides: https://drive.google.com/file/d/1atNvMYNkeKDwXP3olHXzloa09S5pzjXb/view?usp=drive_link
@fizise @enorouzi #scifi #knowledgegraphs #ai #deeplearning #embeddings

Tero Keski-Valkama:
How can you use query vector #embeddings to save costs with #LLM #chatbots? No, do not cache by vector embedding and return the same result for similar queries. That is very rarely what you want. I saw a recent library that does exactly that. Horrible.

It will only make the bot return the same result whether the user is searching for Volvos or for Volkswagens.

What you should do instead (see the sketch below):

Wherever you have a decision point where the bot needs to decide whether a query is of this type or that, you can take that simple classification decision out and use query vector embeddings to make it quickly and cheaply, even locally without an API call.

This allows faster interactions, decreases the number of chatbot API calls, and significantly reduces the number of tokens in the prompt, because the prompt now only needs to handle a single case where before it had to handle multiple alternatives.

Caching is good otherwise: returning the same API result for the exact same query doesn't break expectations and makes things more efficient and cheap. To improve cache hit rates, you can offer query completions that nudge users toward queries that have already been run. Just make sure you don't leak query data between users.

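A minimal sketch of that routing pattern, assuming a local sentence encoder and hand-picked prototype phrases (both assumptions):

```python
# Route a query by embedding similarity to per-intent prototypes: no API call.
import numpy as np
from sentence_transformers import SentenceTransformer

routes = {
    "car_search":   ["find me a car", "show Volvo listings", "used Volkswagens"],
    "order_status": ["where is my order", "track my delivery"],
}

encoder = SentenceTransformer("all-MiniLM-L6-v2")
protos = {name: encoder.encode(examples, normalize_embeddings=True).mean(axis=0)
          for name, examples in routes.items()}

def route(query: str) -> str:
    q = encoder.encode([query], normalize_embeddings=True)[0]
    # Highest-scoring prototype wins; each branch then gets a short,
    # single-purpose prompt instead of one prompt handling every case.
    return max(protos, key=lambda name: float(protos[name] @ q))

print(route("any Volvos for sale?"))    # -> car_search
print(route("has my package shipped?")) # -> order_status
```

In production you would add a score threshold with a fallback to the LLM for ambiguous queries, so the cheap router never silently misroutes.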