lingo.lol is one of the many independent Mastodon servers you can use to participate in the fediverse.
A place for linguists, philologists, and other lovers of languages.

#GPT3


AI trained on AI garbage spits out AI garbage
technologyreview.com/2024/07/2

AI trained on too much AI-generated data produces gibberish
Generative AI models can collapse if training data contains too much AI-generated content
nature.com/articles/d41586-024

AI models collapse when trained on recursively generated data
nature.com/articles/s41586-024

* indiscriminate use of model-generated content in training causes irreversible defects in resulting models

MIT Technology Review · AI trained on AI garbage spits out AI garbage · By Scott J Mulligan
#AI #ML #NLP
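The collapse mechanism is easy to see in miniature. Here is a toy sketch (my own illustration, not the papers' experimental setup) that repeatedly fits a Gaussian to samples drawn from the previous generation's fit; each generation's estimation error is baked into the next generation's "ground truth", so the fitted distribution drifts away from the original data:

```python
import random
import statistics

def fit_gaussian(samples):
    """Estimate mean and stddev from data (the 'training' step)."""
    return statistics.mean(samples), statistics.stdev(samples)

def generate(mean, std, n, rng):
    """Sample synthetic data from the fitted model (the 'AI-generated content')."""
    return [rng.gauss(mean, std) for _ in range(n)]

rng = random.Random(0)
data = generate(0.0, 1.0, 500, rng)  # generation 0: "real" data
fitted_stds = []
for gen in range(10):
    mean, std = fit_gaussian(data)        # train this generation's model
    fitted_stds.append(std)
    data = generate(mean, std, 500, rng)  # next generation trains only on model output
print(fitted_stds)
```

With smaller samples the drift is much faster; the point of the Nature paper is that this compounding error is irreversible without fresh real data.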

Putting together a hashtag block list to clean my timeline of everything about "generative AI" (honestly, I can't take it anymore; some days it's all anyone talks about…).
So far I have: #AI, #IA, #LLM, #ChatGPT, #GPT, #GPT3, #GPT4, #GPT5 (yes, I'm planning ahead), #GoogleGemini, #Copilot, #Bard, #BingChat, #LLama, #Mistral.

Can you think of any others?

I'm hesitant to add #Gemini, but I'm afraid that would also block toots about the protocol…
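For anyone scripting a similar filter, the matching logic is simple; a sketch (the blocked set and sample posts are illustrative, and the #Gemini case shows why the poster hesitates):

```python
import re

# Hashtags to block, lowercased (from the list in the post above)
BLOCKED = {"ai", "ia", "llm", "chatgpt", "gpt", "gpt3", "gpt4", "gpt5",
           "googlegemini", "copilot", "bard", "bingchat", "llama", "mistral"}

HASHTAG = re.compile(r"#(\w+)")

def is_blocked(post_text: str) -> bool:
    """True if the post carries any blocked hashtag (case-insensitive)."""
    return any(tag.lower() in BLOCKED for tag in HASHTAG.findall(post_text))

print(is_blocked("Nice demo of #ChatGPT today"))        # blocked
print(is_blocked("A post about the #Gemini protocol"))  # not blocked: 'gemini' kept off the list
```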

Just read Nathan Jones’s ‘Experiential Literature?’

apria.artez.nl/experiential-li

It explores the differences between texts authored/co-authored by GPT-3, such as K. Allado-McDowell’s Pharmako-AI, and human-authored experimental literature. Drawing on his concept of ‘glitch poetics’ (openhumanitiespress.org/books/), Jones pays special attention to those experiments that arise from working with error in literature:

‘One of the problems with the concept of glitch poetics is that there is never a right functioning literature with which we can measure literary disfunction. … I suggest that GPT-3’s boring, baseline normative writing behaviour makes it ideal for this purpose.’

Certainly made me want to go back to reread Olga Ravn’s haunting experimental sci-fi novel The Employees.

"In this work, we analyze what confuses GPT-3: how the model responds to certain sensitive topics and what effects the prompt wording has on the model response. We find that GPT-3 correctly disagrees with obvious Conspiracies and Stereotypes but makes mistakes with common Misconceptions and Controversies. The model responses are inconsistent across prompts and settings, highlighting GPT-3's unreliability."

Khatun, A. and Brown, D.G. (2023) 'Reliability Check: an analysis of GPT-3’s response to sensitive topics and prompt wording,' arXiv (Cornell University) [Preprint]. doi.org/10.48550/arxiv.2306.06 #AI #ArtificialIntelligence #ComputerScience #LLM #GPT3 #MachineLearning

arXiv.org · Reliability Check: An Analysis of GPT-3's Response to Sensitive Topics and Prompt Wording
Large language models (LLMs) have become mainstream technology with their versatile use cases and impressive performance. Despite the countless out-of-the-box applications, LLMs are still not reliable. A lot of work is being done to improve the factual accuracy, consistency, and ethical standards of these models through fine-tuning, prompting, and Reinforcement Learning with Human Feedback (RLHF), but no systematic analysis of the responses of these models to different categories of statements, or on their potential vulnerabilities to simple prompting changes is available. In this work, we analyze what confuses GPT-3: how the model responds to certain sensitive topics and what effects the prompt wording has on the model response. We find that GPT-3 correctly disagrees with obvious Conspiracies and Stereotypes but makes mistakes with common Misconceptions and Controversies. The model responses are inconsistent across prompts and settings, highlighting GPT-3's unreliability. Dataset and code of our analysis is available in https://github.com/tanny411/GPT3-Reliability-Check.

Everybody’s talking about Mistral, an upstart French challenger to OpenAI

On Monday, Mistral AI announced a new AI language model called Mixtral 8x7B, a "mixture of experts" (MoE) model with open weights that reportedly matches OpenAI's GPT-3.5 in performance—an achievement that has been claimed by others in the past but is being taken seriously by AI heavyweights such as OpenAI's Andrej... #llm #largelanguagemodels #languagemodels #artificialintelligence #ai #openai #llama #chatgpt #gpt4 #gpt3 #huggingface #mistral #mixtral #tech

arstechnica.com/information-te

Ars Technica · Everybody’s talking about Mistral, an upstart French challenger to OpenAI · "Mixture of experts" Mixtral 8x7B helps open-weights AI punch above its weight class.
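The "mixture of experts" idea behind Mixtral: a learned router scores the experts for each input and only the top-k actually run, so the model carries many parameters but spends modest compute per token. A toy scalar sketch of that routing step (not Mixtral's actual architecture, which routes per token inside every transformer layer):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, router_weights, k=2):
    """Route input x to the top-k experts and mix their outputs.

    experts: list of callables standing in for expert sub-networks
    router_weights: one toy linear router score weight per expert
    """
    scores = [w * x for w in router_weights]  # router logits
    topk = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    gate = softmax([scores[i] for i in topk])  # renormalise over the chosen k
    # only the selected experts run: this is the sparse-compute trick
    return sum(g * experts[i](x) for g, i in zip(gate, topk))

experts = [lambda x, a=a: a * x for a in (1.0, 2.0, 3.0, 4.0)]  # 4 toy experts
out = moe_forward(1.0, experts, router_weights=[0.1, 0.3, 0.2, 0.4], k=2)
print(out)  # ~3.05: a softmax-weighted mix of experts 4 and 2
```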

Addendum 1

AutoML-GPT: Large Language Model for AutoML
arxiv.org/abs/2309.01125

* AutoML-GPT integrates comprehensive set of tools & libraries
* grants access to wide range of data preprocessing, feature engineering, & model selection algorithms
* conversational interface: users can specify requirements, constraints, evaluation metrics
* manages complexity of ML pipeline, sig. reduces time/effort req'd

#AutoML #AutoMLGPT #GPT #GPT3 #GPT4 #LargeLanguageModels #LLM #ChatGPT

Next stop on our brief #timeline of (Large) #LanguageModels is 2022:
InstructGPT is introduced by OpenAI, a GPT-3 model complemented and fine-tuned with reinforcement learning from human feedback.
ChatGPT is introduced by OpenAI as a combination of GPT-3, Codex, and InstructGPT including lots of additional engineering.
#ise2023 lecture slides: drive.google.com/file/d/1atNvM
RLHF explained: huggingface.co/blog/rlhf
#ai #creativeai #rlhf #gpt3 #gpt #openai #chatgpt #lecture #artificialintelligence #llm
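The core of the RLHF recipe mentioned above is a reward model trained on human preference pairs; the standard pairwise objective is a Bradley-Terry loss. A minimal sketch of that loss (scalar rewards stand in for the reward model's outputs on a chosen/rejected response pair):

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry pairwise loss used to train RLHF reward models:
    -log sigmoid(r_chosen - r_rejected). Low when the reward model
    already ranks the human-preferred response above the rejected one."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

print(preference_loss(2.0, -1.0))  # small: correct ranking, wide margin
print(preference_loss(-1.0, 2.0))  # large: reward model disagrees with the human label
```

Minimising this loss over many labelled pairs gives the reward signal that the reinforcement-learning step (e.g. PPO in InstructGPT) then optimises against.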


Addendum 7

MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in LLM
arxiv.org/abs/2308.09729

* addendum 4
* prompt LLM w. knowledge graphs
* engages LLM w. ext. knowledge; elicits reasoning pathways
* prompting makes LLM capable of comprehending KG inputs
* mind map on which LLMs perform reasoning, generate answers
* ontology of knowledge
* GPT-3.5 prompted w. MindMap consistently outperforms GPT-4
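The prompting step amounts to serialising KG triples into the prompt so the model can lay out reasoning pathways over them. A sketch of that step, with a hypothetical template and made-up triples (the paper's actual prompt format may differ):

```python
def triples_to_prompt(triples, question):
    """Serialise knowledge-graph triples into evidence text for the LLM.

    triples: iterable of (subject, predicate, object) tuples
    """
    facts = "\n".join(f"({s}, {p}, {o})" for s, p, o in triples)
    return (
        "Build a mind map from the evidence triples below, "
        "then use it to answer the question.\n"
        f"Evidence:\n{facts}\n"
        f"Question: {question}\n"
    )

prompt = triples_to_prompt(
    [("aspirin", "treats", "fever"), ("aspirin", "interacts_with", "warfarin")],
    "Can a patient on warfarin take aspirin for fever?",
)
print(prompt)
```

The point of the paper is that grounding the model in explicit triples like these elicits more reliable reasoning than free-text context alone.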

The take that #gpt3 "has mastered analogical reasoning" is dangerous #bullshit. It's a predictive system, it understands nothing. What's going on is that analogies happen to be highly predictable. I can tell you from personal experience that it's dead easy to generate analogies without really understanding them. That's how I got 98th+ percentile on the SATs.
popsci.com/technology/gpt-3-la
[h/t @librarianshipwreck | mastodon.social/@librarianship]

Popular Science · GPT-3 is pretty good at taking the SATs · It has somehow mastered analogical reasoning, which has long been thought to be a 'uniquely human ability.'

Investigating "Secret Language" (SL) in Language Models
arxiv.org/abs/2307.12507

LMs seem to have hidden vocabulary allowing interpreting absurd inputs as meaningful concepts.

* SL phenomenon exists in different LMs? Yes: after replacing important words in a sentence w. semantically dissimilar words, LMs do not consider the new sentence semantically dissimilar

* SL depends on specific context? No: general & could be transferred to other black-box models (GPT-3, ChatGPT)

arXiv.org · Investigating the Existence of "Secret Language" in Language Models
In this paper, we study the problem of secret language in NLP, where current language models (LMs) seem to have a hidden vocabulary that allows them to interpret absurd inputs as meaningful concepts. We investigate two research questions: "Does the secret language phenomenon exist in different language models?" and "Does secret language depend on specific context?" To answer these questions, we introduce a novel method named SecretFinding, a gradient-based approach that can automatically discover secret languages in LMs. We conduct experiments on five representative models (Electra, ALBERT, Roberta, DistillBERT, and CLIP) finetuned on four NLP benchmarks (SST-2, MRPC, SNLI, and SQuAD) and a language-grounding benchmark (MSCOCO). Our experimental results show that even when we replace the most important words with others that are semantically dissimilar to the original words in a sentence, LMs do not consider the new sentence semantically dissimilar to the original, as the output does not change with a high probability. This phenomenon holds true across the five models and five tasks and gives a positive answer to the first research question. As for the second research question, we find that the secret language discovered by SecretFinding is quite general and could even be transferred to other models in the black-box settings, such as GPT-3 and ChatGPT. Finally, we discuss the causes of secret language, how to eliminate it, the potential connection to memorization, and ethical implications. Examples of secret language found by SecretFinding are available on https://huggingface.co/spaces/anonymousauthors/ACL23_SecretLanguage.

Trying myself on unknown terrain: Just published a working paper about the use of Large Language Models for low-resource programming languages! 🖥️👋

The study shows that #LLM-s can be useful for writing, understanding, improving, and documenting code.

I choose #gretl + its domain-specific scripting language for #statistics + #econometrics for illustration.

Comments welcome!
arxiv.org/abs/2307.13018

#softwaredevelopment #computerscience #GPT3.5 #econtwitter

@gretl

arXiv.org · The potential of LLMs for coding with low-resource and domain-specific programming languages
This paper presents a study on the feasibility of using large language models (LLM) for coding with low-resource and domain-specific programming languages that typically lack the amount of data required for effective LLM processing techniques. This study focuses on the econometric scripting language named hansl of the open-source software gretl and employs a proprietary LLM based on GPT-3.5. Our findings suggest that LLMs can be a useful tool for writing, understanding, improving, and documenting gretl code, which includes generating descriptive docstrings for functions and providing precise explanations for abstract and poorly documented econometric code. While the LLM showcased promising docstring-to-code translation capability, we also identify some limitations, such as its inability to improve certain sections of code and to write accurate unit tests. This study is a step towards leveraging the power of LLMs to facilitate software development in low-resource programming languages and ultimately to lower barriers to entry for their adoption.

Availability and Framing biases detected when prompting #BERT large language model:

“On a drug-drug interaction extraction task, [we detect] an error pattern similar to the availability bias when the labels for training prompts are imbalanced, and show that a toning-down transformation of the drug-drug description in a prompt can elicit a bias similar to the framing effect….”

aclanthology.org/2023.findings

#AI #LLM #GPT3

Can I say, now machines can think?
arxiv.org/abs/2307.07526

Generative AI allows machines to produce images, generate answers/stories, & write code based on the user prompts - generating human-like responses

The Turing Test is a critical aspect of evaluating machines' abilities. There are other aspects of intelligence, & AI machines exhibit most of these aspects.

Note Figs. 2, 3 😯

arXiv.org · Can I say, now machines can think?
Generative AI techniques have opened the path for new generations of machines in diverse domains. These machines have various capabilities, for example, they can produce images, generate answers or stories, and write code based on the "prompts" only provided by users. These machines are considered 'thinking minds' because they have the ability to generate human-like responses. In this study, we have analyzed and explored the capabilities of artificial intelligence-enabled machines. We have revisited Turing's concept of thinking machines and compared it with recent technological advancements. The objections and consequences of the thinking machines are also discussed in this study, along with available techniques to evaluate machines' cognitive capabilities. We have concluded that the Turing Test is a critical aspect of evaluating machines' ability. However, there are other aspects of intelligence too, and AI machines exhibit most of these aspects.