lingo.lol is one of the many independent Mastodon servers you can use to participate in the fediverse.
A place for linguists, philologists, and other lovers of languages.

#textmining


📯 This week at #DigitalHistoryOFK: Torsten Hiltmann and @DigHisNoah present "RAG den Spiegel", an innovative RAG system for analysing the SPIEGEL archive. The talk shows how #LLMs are changing historical scholarship and how they connect hermeneutic with computational methods.
📅 25 June, 4-6 pm, online (access on request)
ℹ️ Abstract: dhistory.hypotheses.org/10912 #TextMining #4memory #DigitalHistory @historikerinnen @histodons @digitalhumanities

Folks working in the #DigitalHumanities or #TextMining and related research fields, a technical question: do you use a database management system (DBMS) to store your data? Or do you use good old JSON or CSV files on local drives? If the first, what do you use (Postgres, MySQL, Mongo)? If the second, how do you sync your files to enable collaboration on the same data?

I'm starting a new project, and from past experience I think it would be best to set up a managed DB from the beginning, instead of using JSON files. That way my team has access to the same data and we can query the specific data we need for some analysis.
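To make the trade-off in this question concrete, here is a minimal sketch of moving JSON records into a database and querying only the subset needed for an analysis. It uses Python's built-in `sqlite3` module purely as a stand-in; a team would more likely point the same kind of code at a shared server such as Postgres. The table name, column names, sample records and the `texts_by_author` helper are all hypothetical.

```python
import json
import sqlite3

# Hypothetical records, as they might currently sit in a JSON file.
raw = json.loads("""
[
  {"id": 1, "author": "A", "year": 1850, "text": "first letter"},
  {"id": 2, "author": "B", "year": 1862, "text": "second letter"},
  {"id": 3, "author": "A", "year": 1871, "text": "third letter"}
]
""")

# In-memory database for the sketch; a file path or a server connection
# would replace ":memory:" in a real project.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, author TEXT, year INTEGER, text TEXT)"
)
# Named placeholders let each JSON object be inserted directly as a row.
conn.executemany(
    "INSERT INTO documents (id, author, year, text) "
    "VALUES (:id, :author, :year, :text)",
    raw,
)
conn.commit()


def texts_by_author(conn, author, since=0):
    """Return (id, text) pairs for one author from a given year onward."""
    rows = conn.execute(
        "SELECT id, text FROM documents "
        "WHERE author = ? AND year >= ? ORDER BY year",
        (author, since),
    )
    return rows.fetchall()


result = texts_by_author(conn, "A", since=1860)
```

With flat JSON files the same selection means loading every record and filtering in application code; with a database the filter runs where the data lives, and every collaborator queries the same copy.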

Open Access book edited by Silke Schwandt: Digital Methods in the Humanities.
Explore interdisciplinary challenges, case studies, and innovative perspectives on digital tools in textual research.
Includes: From Serial Sources to Modeled Data, OCR, text mining & more.
transcript-verlag.de/978-3-837
#DigitalHumanities #OpenAccess #DigitalMethods #TextMining #HumanitiesResearch #SilkeSchwandt #transcriptVerlag

transcript Verlag · Digital Methods in the Humanities: Volume 1 of »Digital Humanities Research« offers a unique perspective on digital methods for and in the humanities.

Code4Lib: Distant Listening: Using Python and Apps Scripts to Text Mine and Tag Oral History Collections. “Designed for oral history project managers, the workflow empowers student workers to generate, modify, and expand subject tags during transcription editing, thereby enhancing the overall accuracy and discoverability of the collection. The paper details the workflow, surveys challenges […]

https://rbfirehose.com/2025/04/15/distant-listening-using-python-and-apps-scripts-to-text-mine-and-tag-oral-history-collections-code4lib/

From the #Archiv (archive) to the #Datenbank (database). What #TextMining and #GraphModelling methods can contribute to a comparative #Sozialgeschichte (social history) of coercion in the #Spätmittelalter (late Middle Ages): Juliane Schiel (Univ. of Vienna) at the next #Jeudi lecture, with a response by Simona Cerutti (EHESS)

10 April | 18:00 | hybrid | DE-FR

dhi-paris.fr/veranstaltungsdet

@histodons #WORCK #DH #digitaleTextanalyse #DigitalHumanties #DigitalHistory

Text Mining Workshop (tf-idf), co-organised by HDSM, TU Darmstadt and UCL 📑
📌Date: 4th April
📌Place: Connected Environments Lab – Room 107 (First Floor)
UCL East, One Pool Street
Stratford, London
E20 2AF
For registration and other details:
ucl.ac.uk/digital-humanities/e

#workshop #tf-idf #digitalhumanities #textmining

UCL Centre for Digital Humanities · Text Mining with tf-idf: Tf-idf is an information retrieval method to extract distinctive keywords from documents. In this interactive workshop we will explore the possibilities and limitations of the technique.
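For readers who have not met the technique named in the workshop title, here is a minimal sketch of tf-idf in plain Python. The toy documents, the function names and the particular weighting variant (relative term frequency multiplied by log(N / document frequency)) are assumptions chosen for illustration; they are not taken from the workshop material, and libraries often use smoothed variants.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer; real corpora need something better.
    return text.lower().split()


def tfidf(docs):
    """Return one {term: score} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # Relative term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(docs, n=2):
    # Highest score first; ties are broken alphabetically for stable output.
    return [
        [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]
        for s in tfidf(docs)
    ]


docs = [
    "the ship sailed the sea",
    "the train crossed the mountains",
    "the ship reached the harbour",
]
keywords = top_keywords(docs)
```

A word such as "the" appears in every document, so its inverse document frequency is log(1) = 0 and it never surfaces as a keyword. This also hints at a limitation: the score depends entirely on the comparison corpus.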

🌍 Automating Nature Detection in Historical Travelogues?

At #Dhd2025 Michela Vignoli & Doris Gruber (ONiT Project) explore how #LLM Llama 3.1 70B can analyze nature representations in multilingual travel reports

⚠️ Challenges remain:
❌ LLMs always produce results, even with flawed data
❌ LLM-corrected texts did not improve searchability in vector databases (3–14% drop)
🔎 Conclusion: LLMs aid discovery, but manual review is essential for a reliable dataset.

[Data Workshop] The INA lab is organising a workshop @iscpif on 12 March at 5:30 pm devoted to exploring (#statistique, #TAL…) transcripts of the TF1 and FR2 television news broadcasts.
A few places are still available: framaforms.org/atelier-donnees

Some independence with quantitative analysis tools (Python or R, CSV, etc.) is required to get the full benefit of the workshop.

framaforms.org · Atelier données INA | Framaforms.org

Our #DHELab is continuing its online lecture series in 2025. On 31 January 2025, 12-1 pm, @ChristophSchindler @j_roeschlein & @alexchrist will present the web app #EduTopics, which analyses more than 30,000 #ECER contributions and makes them available interactively, with visualisations, together with their bibliographic parameters. They will also explore how #MachineLearning & #Textmining enable the analysis of large text corpora and open up new ways of accessing research information:
bbf.dipf.de/de/dhelab-202501
#digitalhumanities

Natural Language Processing (NLP) is a fascinating field at the intersection of data science, linguistics, and artificial intelligence. It focuses on enabling machines to understand, interpret, and respond meaningfully to human language.

#NLP #NaturalLanguageProcessing #DataScience #ArtificialIntelligence #MachineLearning #SentimentAnalysis #Chatbots #TextMining #Linguistics #AI #DeepLearning #DataScienceBeginner #LanguageModels

669f4ce3eee52.site123.me/blog/

Excited that Keli Du is going to be presenting work we talked about a lot: "Shifting Sentiments? What happens to BERT-based Sentiment Classification when derived text formats are used for fine-tuning". (DTFs below refers to these derived text formats.)

Results show that, as long as you don't remove too much information from the texts, the performance stays at pretty acceptable levels, even when DTFs are used for fine-tuning.