[LangExtract](https://developers.googleblog.com/en/introducing-langextract-a-gemini-powered-information-extraction-library/) has got me curious, but I don't get what makes it different from a [spacy-llm/prodigy](https://prodi.gy/docs/large-language-models) setup. Is it just that it spares me from writing the Python code that chunks long input and/or builds the output JSON from entities and offsets?
Ah, one more difference is that LangExtract is #OpenSource whereas Prodigy is not (?). (On the other hand, Prodigy has better integration with a correction+training workflow.)
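For context, here is a rough sketch of the do-it-yourself work the question refers to: splitting long input into overlapping chunks, then mapping chunk-local matches back to document-level offsets as JSON-ready records. It does not use LangExtract, spacy-llm, or Prodigy. `fake_extract` is a hypothetical stand-in for an LLM call, and the chunk size, overlap, and record fields are assumptions chosen for illustration.

```python
import json


def chunk_text(text, size=200, overlap=40):
    """Yield (start_offset, chunk) pairs covering text with overlapping windows."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    for start in range(0, len(text), step):
        yield start, text[start:start + size]
        if start + size >= len(text):
            break  # the last window already reaches the end of the text


def fake_extract(chunk):
    """Hypothetical stand-in for an LLM call: returns (label, surface string) pairs."""
    vocabulary = {"LangExtract": "TOOL", "Prodigy": "TOOL"}
    return [(label, term) for term, label in vocabulary.items() if term in chunk]


def extract_with_offsets(text, size=200, overlap=40):
    """Run the extractor per chunk and report document-level character offsets."""
    seen = set()
    results = []
    for base, chunk in chunk_text(text, size, overlap):
        for label, term in fake_extract(chunk):
            pos = chunk.find(term)
            while pos != -1:
                span = (base + pos, base + pos + len(term))
                if span not in seen:  # overlapping chunks can report the same span twice
                    seen.add(span)
                    results.append(
                        {"label": label, "text": term, "start": span[0], "end": span[1]}
                    )
                pos = chunk.find(term, pos + 1)
    return sorted(results, key=lambda r: r["start"])


if __name__ == "__main__":
    doc = "x" * 190 + "LangExtract" + "y" * 50 + "Prodigy"
    print(json.dumps(extract_with_offsets(doc), indent=2))
```

The overlap is what lets a term that straddles a chunk boundary show up whole in the next window. A term longer than the overlap could still be missed, which is one of the details a library would be expected to handle.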
Predictive Testimony explores how compiled syntax in AI-generated police reports and judicial narratives redefines testimony as a syntactic construct, not observation. Key reading for #MedicalNLP and #LegalTech.
https://zenodo.org/records/16695136
New article published: Syntax Without Subject
What happens when AI writes rules but removes the speaker?
This study tracks how LLMs erase the subject from legal, medical, and policy texts.
We call this structural delegation. https://zenodo.org/records/16571077
#MedicalNLP #LegalTech
#MedTech #AIethics #AIgovernance #healthcare #ArtificialIntelligence #NLP #aifutures #LawFedi #lawstodon #tech #finance #business #agustinvstartari #medical #linguistics #ai #LRM #ClinicalAI #politics
This is Ficaria verna (formerly Ranunculus ficaria L.), also known as lesser celandine or pilewort (per Wikipedia and other sites).
I have been testing the gemma 3 4b-it-q4_0 multimodal vision model for a while: it is not accurate and can't be trusted.
For instance, it thinks it is a Ranunculus acris (it isn't). It's really hit and miss with this model. I guess it could still be useful to provide some clues or vocabulary.
New published article:
Syntax Without Subject: Structural Delegation and the Disappearance of Political Agency in LLM-Governed Contexts
https://zenodo.org/records/16571077
This study introduces the concept of structural delegation to explain how large language models produce legally
#MedicalNLP #LegalTech
#MedTech #AIethics #AIgovernance #healthcare #ArtificialIntelligence #NLP #aifutures #LawFedi #lawstodon #tech #finance #business #agustinvstartari #medical #linguistics #ai #LRM #ClinicalAI #politics
New article: Sovereign Syntax in Financial Disclosure
How LLMs simulate trust using grammar, not facts. SDRI = a syntactic risk index for crypto whitepapers.
Audit language structure, not just claims. https://zenodo.org/records/16421548 | https://papers.ssrn.com/abstract=5366241
New paper out: Sovereign Syntax in Financial Disclosure. LLMs simulate legitimacy via grammar.
SDRI = structural index to detect risk in crypto texts.
https://zenodo.org/records/16421548
https://papers.ssrn.com/abstract=5366241
#LLM #AIethics #ai #finance #agustinvstartari #MedicalNLP #LegalTech #MedTech #AIgovernance #cryptoreg #healthcare #ArtificialIntelligence #NLP #aifutures #LawFedi #lawstodon #tech #business #medical #LRM #ClinicalAI #politics #regulation
New paper out:
When Grammar Fails the Audit: How One Bad Sentence Can Cost Your Company Millions
Solution: fair-syntax transformation → reduces errors by 15%.
Read: https://zenodo.org/records/16322760
Agustin V. Startari
#LLM #MedicalNLP #LegalTech #MedTech #AIethics #AIgovernance #cryptoreg #healthcare #ArtificialIntelligence #NLP #aifutures #LawFedi #lawstodon #tech #finance #business #agustinvstartari #medical #linguistics #ai #LRM #ClinicalAI #politics #regulation
New study: Expense Coding Syntax explores how LLM-based ERP accounts mislabel costs and elevate audit risk.
SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5361952
Zenodo: https://zenodo.org/records/16325643
What is #NLP research 𝘳𝘦𝘢𝘭𝘭𝘺 about?
We analyzed 29k+ papers to find out!
Our NLPContributions dataset, from the ACL Anthology, reveals what authors actually contribute—artifacts, insights, and more.
Trends show a swing back towards language & society. Curious where you fit in?
Tools, data, and analysis await you:
Paper: https://arxiv.org/abs/2409.19505
Project: https://ukplab.github.io/acl25-nlp-contributions/
Code: https://github.com/UKPLab/acl25-nlp-contributions
Data: https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/4678
(1/)
Are there any data on inter-annotator agreement for all the treebanks and other linguistic data used to train #nlp models? #linguistics
Gemini Deep Think learns math, wins gold medal at International Math Olympiad
https://arstechnica.com/ai/2025/07/google-deepmind-earns-gold-in-international-math-olympiad-with-new-gemini-ai/
#AI #artificialintelligence #Google #DeepMind #DeepThink #mathematics #NLP