Actually Useful AI


Welcome! 🤖

Our community focuses on programming-oriented, hype-free discussion of Artificial Intelligence (AI) topics. We aim to curate content that truly contributes to the understanding and practical application of AI, making it, as the name suggests, "actually useful" for developers and enthusiasts alike.

Be an active member! 🔔

We highly value participation in our community. Whether it's asking questions, sharing insights, or sparking new discussions, your engagement helps us all grow.

What can I post? 📝

In general, anything related to AI is acceptable. However, we encourage you to strive for high-quality content.

What is not allowed? 🚫

General Rules 📜

Members are expected to engage in on-topic discussions and exhibit mature, respectful behavior. Those who fail to uphold these standards may find their posts or comments removed, with repeat offenders potentially facing a permanent ban.

While we appreciate focus, a little humor and off-topic banter, when tasteful and relevant, can also add flavor to our discussions.

Related Communities 🌐

General

Chat

Image

Open Source

Please message @sisyphean@programming.dev if you would like us to add a community to this list.

Icon base by Lord Berandas under CC BY 3.0 with modifications to add a gradient

founded 2 years ago
 
 

Announcement

The bot I announced in this thread is now ready for a limited beta release.

You can see an example summary it wrote here.

How to Use AutoTLDR

  • Just mention it ("@" + "AutoTLDR") in a comment or post, and it will generate a summary for you.
  • If mentioned in a comment, it will try to summarize the parent comment, but if there is no parent comment, it will summarize the post itself.
  • If the parent comment contains a link, or if the post is a link post, it will summarize the content at that link.
  • If there is no link, it will summarize the text of the comment or post itself.
  • 🔒 If you include the #nobot hashtag in your profile, it will not summarize anything posted by you.
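The selection rules above can be sketched in a few lines of Python. This is only an illustration of the rules as listed, not the bot's actual code: the `Item` class, its fields, and the `pick_target` function are hypothetical names, and the URL pattern is a deliberately crude assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Crude URL matcher, good enough for a sketch (assumption, not the bot's real logic).
URL_RE = re.compile(r"https?://\S+")


@dataclass
class Item:
    """A post or comment. Hypothetical shape, not Lemmy's real data model."""
    author_bio: str
    text: str
    is_post: bool
    link: Optional[str] = None  # only meaningful for link posts


def link_of(item: Item) -> Optional[str]:
    """Return the link to summarize, if the item has one."""
    if item.is_post:
        return item.link  # a link post carries its URL in a dedicated field
    match = URL_RE.search(item.text)
    return match.group(0) if match else None


def pick_target(post: Item, parent_comment: Optional[Item]) -> Optional[Tuple[str, str]]:
    """Decide what to summarize: ("url", link), ("text", body), or None to skip."""
    # Prefer the parent comment; fall back to the post itself.
    target = parent_comment if parent_comment is not None else post
    # Respect the #nobot opt-out in the author's profile.
    if "#nobot" in target.author_bio:
        return None
    link = link_of(target)
    if link is not None:
        return ("url", link)
    return ("text", target.text)
```

For example, `pick_target(post, None)` on a text-only post returns `("text", ...)` with the post body, while a parent comment containing a URL yields `("url", ...)`.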

Beta limitations

How to try it

  • If you want to test the bot, write a long comment, or include a link in a comment in this thread, and then, in a reply comment, mention the bot.
  • Feel free to test it and try to break it in this thread. Please report any weird behavior you encounter in a PM to me (NOT the bot).
  • You can also use it for its designated purpose anywhere in the AUAI community.

On giving AI eyes and ears (www.oneusefulthing.org)
submitted 2 years ago* (last edited 2 years ago) by sisyphean@programming.dev to c/auai@programming.dev
 
 

TL;DR (by GPT-4 🤖)

The article discusses the evolution of AI beyond text-based chatbots, highlighting the emergence of multimodal AI, which can process different kinds of input, including images. This development allows AI to "see" and understand images, significantly enhancing its capabilities and enabling it to interact with the world in new ways. The article also mentions the integration of OpenAI's Whisper, a highly effective voice-to-text system, into the ChatGPT app, which changes how AI can be used, such as serving as an intelligent assistant. The author emphasizes that AI's growing capabilities, including internet connectivity, code execution, and the ability to watch and listen, have profound implications, necessitating a thoughtful consideration of both the benefits and concerns.

Notes (by GPT-4 🤖)

AI Evolution Beyond Text

  • AI has evolved beyond being just chatbots. New modes of AI usage have emerged, such as the write-it-for-me buttons in Google Docs, which seamlessly integrate AI into work processes.
  • These changes have significant implications for work and the meaning of writing.

Multimodal AI

  • The most advanced AI, GPT-4, is a multimodal AI, which means it can process different kinds of input, including images.
  • Multimodal AI allows the AI to "see" images and "understand" what it is seeing. This capability significantly enhances what AI can do, despite occasional errors and hallucinations.

AI Interaction with the World

  • Because AI can now "see," it can interact with the world in an entirely new way, with significant implications.
  • For instance, AI can now build and refine prototypes using vision, a substantial increase in capabilities.

AI Voice Recognition

  • OpenAI's Whisper is a highly effective voice-to-text system that is now part of the ChatGPT app on mobile phones.
  • This integration changes how AI can be used, such as serving as an intelligent assistant that can understand intent rather than just dictation.

AI in Education

  • Voice recognition can be useful in education, providing real-time presentation feedback.
  • For example, GPT-4 can act as a real-time virtual VC, providing feedback on startup pitches.

AI's Growing Capabilities

  • AI's knowledge and capabilities have expanded beyond just text and include internet connectivity, code execution, and now, the ability to watch and listen.
  • These advancements mean that jobs requiring visual or audio interactions are no longer insulated from AI.
  • The implications of these capabilities are profound, and there is a need to start considering both the benefits and concerns today.

Understanding GPT tokenizers (simonwillison.net)
submitted 2 years ago* (last edited 2 years ago) by sisyphean@programming.dev to c/auai@programming.dev
 
 

This is an excellent overview of tokenization with many interesting examples. I also like Simon's small CLI tools; you can read about them at the end of the post.

As usual, I've asked GPT-4 to write a TL;DR and detailed notes for it.

Notice that it couldn't print the "davidjl" glitch token, and (probably because of its presence) the notes are also incomplete. At first I thought it was because the text of the article was longer than the context window, but the TL;DR contains details the notes don't, so that probably wasn't the case.

I've still decided to copy the notes here because they are generally useful and also demonstrate this weird behavior.

TL;DR (by GPT-4 🤖)

The article discusses the concept of tokenization in large language models like GPT-3/4, LLaMA, and PaLM. These models convert text into tokens (integers) and predict the next tokens. The author explains how English words are usually assigned a single token, while non-English languages often have less efficient tokenization. The article also explores "glitch tokens," which exhibit unusual behavior, and the necessity of counting tokens to ensure OpenAI's models' token limit is not exceeded. The author introduces a Python library called tiktoken and a command-line tool called ttok for this purpose. Understanding tokens can help make sense of how GPT tools generate text.

Notes (by GPT-4 🤖)

Understanding GPT Tokenizers

  • Large language models like GPT-3/4, LLaMA, and PaLM operate in terms of tokens, which are integers representing text. They convert text into tokens and predict the next tokens.
  • OpenAI provides a Tokenizer tool for exploring how tokens work. The author has also built a tool as an Observable notebook.
  • The notebook can convert text to tokens, tokens to text, and run searches against the full token table.

Tokenization Examples

  • English words are usually assigned a single token. For example, "The" is token 464, " dog" is token 3290, and " eats" is token 25365.
  • Capitalization and leading spaces are important in tokenization. For instance, "The" with a capital T is token 464, but " the" with a leading space and a lowercase t is token 262.
  • Languages other than English often have less efficient tokenization. For example, the Spanish sentence "El perro come las manzanas" is encoded into seven tokens, while the English equivalent "The dog eats the apples" is encoded into five tokens.
  • Some languages may have single characters that encode to multiple tokens, such as certain Japanese characters.
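To make the capitalization and leading-space point concrete, here is a toy lookup that uses only the four token IDs quoted above. It is a sketch under loud assumptions: real GPT tokenizers use byte-pair encoding over a vocabulary of tens of thousands of entries, whereas this stand-in does a greedy longest-match against a four-entry dictionary. The names `VOCAB` and `encode` are made up for the example.

```python
from typing import List

# Only the IDs mentioned in the notes above; everything else is unknown to this toy.
VOCAB = {"The": 464, " the": 262, " dog": 3290, " eats": 25365}


def encode(text: str) -> List[int]:
    """Greedy longest-match encoding against VOCAB (a stand-in for real BPE)."""
    ids: List[int] = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining candidate first, then shrink.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos = end
                break
        else:
            # No entry matches: the toy vocabulary has no fallback tokens.
            raise ValueError(f"no token for text starting at {text[pos:]!r}")
    return ids
```

With this vocabulary, `encode("The dog eats")` gives `[464, 3290, 25365]`, while `encode("the")` raises an error, because only the capitalized "The" and the space-prefixed " the" have entries.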

Glitch Tokens and Token Counting

  • There are "glitch tokens" that exhibit unusual behavior. For example, token 23282—"djl"—is one such glitch token. It's speculated that this token refers to a Reddit user who posted incremented numbers hundreds of thousands of times, and this username ended up getting its own token in the training data.
  • OpenAI's models have a token limit, and it's sometimes necessary to count the number of tokens in a string before passing it to the API to ensure the limit is not exceeded. OpenAI provides a Python library called tiktoken for this purpose.
  • The author also introduces a command-line tool called ttok, which can count tokens in text and truncate text down to a specified number of tokens.
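Counting and truncating can be sketched as follows. The tokenizer here is a hypothetical stand-in that splits on whitespace, so its counts will not match a real GPT tokenizer; in practice you would swap in an encoder such as the one tiktoken provides. The function names are invented for this example.

```python
import re
from typing import List


def toy_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer: each token is a word plus its leading whitespace."""
    return re.findall(r"\s*\S+", text)


def count_tokens(text: str) -> int:
    """Number of tokens the stand-in tokenizer produces for text."""
    return len(toy_tokenize(text))


def truncate(text: str, max_tokens: int) -> str:
    """Keep at most max_tokens tokens from the start of text."""
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    return "".join(toy_tokenize(text)[:max_tokens])


def fits(text: str, limit: int) -> bool:
    """True if text is within the token limit."""
    return count_tokens(text) <= limit
```

A typical use is to check `fits(prompt, limit)` before sending a request and fall back to `truncate(prompt, limit)` when it does not.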

Token Generation

  • Understanding tokens can help make sense of how GPT tools generate text. For example, names not in the dictionary, like "Pelly", take multiple tokens, but "Captain Gulliver" outputs the token "Captain" as a single chunk.
 
 

TL;DR (by GPT-4 🤖)

The article discusses the concept of building autonomous agents powered by Large Language Models (LLMs), such as AutoGPT, GPT-Engineer, and BabyAGI. These agents use LLMs as their core controller, with key components including planning, memory, and tool use. Planning involves breaking down tasks into manageable subgoals and self-reflecting on past actions to improve future steps. Memory refers to the agent's ability to utilize short-term memory for in-context learning and long-term memory for retaining and recalling information. Tool use allows the agent to call external APIs for additional information. The article also discusses various techniques and frameworks for task decomposition and self-reflection, different types of memory, and the use of external tools to extend the agent's capabilities. It concludes with case studies of LLM-empowered agents for scientific discovery.

Notes (by GPT-4 🤖)

LLM Powered Autonomous Agents

  • The article discusses the concept of building agents with Large Language Models (LLMs) as their core controller, with examples such as AutoGPT, GPT-Engineer, and BabyAGI. LLMs have the potential to be powerful general problem solvers.

Agent System Overview

  • The LLM functions as the agent’s brain in an LLM-powered autonomous agent system, complemented by several key components:
    • Planning: The agent breaks down large tasks into smaller subgoals and can self-reflect on past actions to improve future steps.
    • Memory: The agent utilizes short-term memory for in-context learning and long-term memory to retain and recall information over extended periods.
    • Tool use: The agent can call external APIs for extra information that is missing from the model weights.

Component One: Planning

  • Task Decomposition: Techniques like Chain of Thought (CoT) and Tree of Thoughts are used to break down complex tasks into simpler steps.
  • Self-Reflection: Frameworks like ReAct and Reflexion allow the agent to refine past action decisions and correct previous mistakes. Chain of Hindsight (CoH) and Algorithm Distillation (AD) are methods that encourage the model to improve on its own outputs.

Component Two: Memory

  • The article discusses the different types of memory in human brains and how they can be mapped to the functions of an LLM. It also discusses Maximum Inner Product Search (MIPS) for fast retrieval from the external memory.
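As a rough illustration of what Maximum Inner Product Search computes, here is an exact brute-force version in plain Python. It is a baseline sketch only: it scores every stored vector, whereas fast retrieval in practice relies on approximate indexes. The function names and the tiny example vectors are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def mips(query: Sequence[float],
         memory: Sequence[Sequence[float]],
         k: int = 1) -> List[Tuple[int, float]]:
    """Return the k (index, score) pairs with the largest inner product."""
    scored = [(i, dot(query, vec)) for i, vec in enumerate(memory)]
    # Highest score first; ties keep their original order (sort is stable).
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Given `memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]` and the query `[1.0, 0.2]`, the top two results are the vectors at index 0 and index 2.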

Tool Use

  • The agent can use external tools to extend its capabilities. Examples include MRKL, TALM, Toolformer, ChatGPT Plugins, OpenAI API function calling, and HuggingGPT.
  • API-Bank is a benchmark for evaluating the performance of tool-augmented LLMs.
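The general shape of tool use can be sketched as a small dispatcher: the model emits a structured request, and ordinary code looks up the named tool and runs it. Everything here is hypothetical, including the JSON shape, the `TOOLS` registry, and the two example tools; it is not the format of any of the systems listed above.

```python
import json
from typing import Callable, Dict

# Registry of callable tools, keyed by name (hypothetical design).
TOOLS: Dict[str, Callable[..., object]] = {}


def tool(fn: Callable[..., object]) -> Callable[..., object]:
    """Decorator that registers a function as a tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    return a + b


@tool
def word_count(text: str) -> int:
    return len(text.split())


def dispatch(call_json: str) -> str:
    """Run the tool named in a JSON request and return a JSON reply."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**call.get("arguments", {}))
    return json.dumps({"result": result})
```

The reply string would then be appended to the model's context, so the next generation step can use the result.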

Case Studies

  • The article presents case studies of LLM-empowered agents for scientific discovery, such as ChemCrow and a system developed by Boiko et al. (2023). These agents can handle autonomous design, planning, and performance of complex scientific experiments.
 
 

👋 Hello everyone, welcome to our very first Weekly Discussion thread!

This week, we're focusing on the applications of AI that you've found particularly noteworthy.

We're not just looking for headline-making AI applications. We're interested in the tools that have made a real difference in your day-to-day routine, or a unique AI feature that you've found useful. Have you discovered a new way to utilize ChatGPT? Perhaps Stable Diffusion or Midjourney has helped you generate an image that you're proud of?

Let's share our knowledge and learn more about the various applications of AI. Looking forward to your contributions.

 
 

TL;DR (by GPT-4 🤖):

Prompt Engineering, or In-Context Prompting, is a method used to guide large language models (LLMs) towards desired outcomes without changing the model weights. The article discusses various techniques such as basic prompting, instruction prompting, self-consistency sampling, Chain-of-Thought (CoT) prompting, automatic prompt design, augmented language models, retrieval, programming language, and external APIs. The effectiveness of these techniques can vary significantly among models, necessitating extensive experimentation and heuristic approaches. The article emphasizes the importance of selecting diverse and relevant examples, giving precise instructions, and using external tools to enhance the model's reasoning skills and knowledge base.

Notes (by GPT-4 🤖):

Prompt Engineering: An Overview

  • Introduction
    • Prompt Engineering, also known as In-Context Prompting, is a method to guide the behavior of large language models (LLMs) towards desired outcomes without updating the model weights.
    • The effectiveness of prompt engineering methods can vary significantly among models, necessitating extensive experimentation and heuristic approaches.
    • This article focuses on prompt engineering for autoregressive language models, excluding Cloze tests, image generation, or multimodality models.
  • Basic Prompting
    • Zero-shot and few-shot learning are the two most basic approaches for prompting the model.
    • Zero-shot learning involves feeding the task text to the model and asking for results.
    • Few-shot learning presents a set of high-quality demonstrations, each consisting of both input and desired output, on the target task.
  • Tips for Example Selection and Ordering
    • Examples should be chosen that are semantically similar to the test example.
    • The selection of examples should be diverse, relevant to the test sample, and in random order to avoid biases.
  • Instruction Prompting
    • Instruction prompting involves giving the model direct instructions, which can be more token-efficient than few-shot learning.
    • Models like InstructGPT are fine-tuned with high-quality tuples of (task instruction, input, ground truth output) to better understand user intention and follow instructions.
  • Self-Consistency Sampling
    • Self-consistency sampling involves sampling multiple outputs and selecting the best one out of these candidates.
    • The criteria for selecting the best candidate can vary from task to task.
  • Chain-of-Thought (CoT) Prompting
    • CoT prompting generates a sequence of short sentences that describe the reasoning step by step, leading to the final answer.
    • CoT prompting can be either few-shot or zero-shot.
  • Automatic Prompt Design
    • Automatic Prompt Design involves treating prompts as trainable parameters and optimizing them directly on the embedding space via gradient descent.
  • Augmented Language Models
    • Augmented Language Models are models that have been enhanced with reasoning skills and the ability to use external tools.
  • Retrieval
    • Retrieval helps with tasks that require knowledge newer than the model's pretraining cutoff, or knowledge from an internal/private knowledge base.
    • Many methods for Open Domain Question Answering depend on first doing retrieval over a knowledge base and then incorporating the retrieved content as part of the prompt.
  • Programming Language and External APIs
    • Some models generate programming language statements to resolve natural language reasoning problems, offloading the solution step to a runtime such as a Python interpreter.
    • Other models are augmented with text-to-text API calls, guiding the model to generate API call requests and append the returned result to the text sequence.
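Two of the techniques above, few-shot prompting and self-consistency sampling, are easy to sketch without calling any model. The prompt layout ("Input:" / "Output:") and the function names are assumptions made for this example, and majority voting is just one possible selection criterion; as the notes say, the criterion can vary from task to task.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def build_few_shot_prompt(instruction: str,
                          examples: Iterable[Tuple[str, str]],
                          query: str) -> str:
    """Assemble an instruction, demonstrations, and the new input into one prompt."""
    parts: List[str] = [instruction.strip(), ""]
    for example_input, example_output in examples:
        parts += [f"Input: {example_input}", f"Output: {example_output}", ""]
    # Leave the final output empty so the model completes it.
    parts += [f"Input: {query}", "Output:"]
    return "\n".join(parts)


def majority_vote(answers: Iterable[str]) -> str:
    """Pick the most frequent answer among several sampled outputs."""
    normalized = [answer.strip().lower() for answer in answers]
    if not normalized:
        raise ValueError("need at least one answer")
    return Counter(normalized).most_common(1)[0][0]
```

In use, you would send the same prompt several times with a non-zero temperature, collect the final answers, and pass them to `majority_vote`. Passing an empty example list to `build_few_shot_prompt` gives the zero-shot version of the same prompt.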
 
 

From the “About” section:

goblin.tools is a collection of small, simple, single-task tools, mostly designed to help neurodivergent people with tasks they find overwhelming or difficult.

Most tools will use AI technologies in the back-end to achieve their goals. Currently this includes OpenAI's models. As the tools and backend improve, the intent is to move to an open source alternative.

The AI models used are general purpose models, and so the accuracy of their output can vary. Nothing returned by any of the tools should be taken as a statement of truth, only guesswork. Please use your own knowledge and experience to judge whether the result you get is valid.

 
 

Original tweet:

https://twitter.com/goodside/status/1672121754880180224?s=46&t=OEG0fcSTxko2ppiL47BW1Q

Text:

If you put violence, erotica, etc. in your code Copilot just stops working and I happen to need violence, erotica, etc. in Jupyter for red teaming so I always have to make an evil.py to sequester constants for import.

not wild about this. please LLMs i'm trying to help you

(screenshot of evil.py full of nasty things)

 
 

Here is the link to the example epubs:

https://github.com/mshumer/gpt-author/tree/main/example_novel_outputs

I’m not sure how I feel about this project.

 
 

TL;DR (by GPT-4 🤖):

The article titled "It’s infuriatingly hard to understand how closed models train on their input" discusses the concerns and lack of transparency surrounding the training data used by large language models like GPT-3, GPT-4, Google's PaLM, and Anthropic's Claude. The author expresses frustration over the inability to definitively state that private data passed to these models isn't being used to train future versions due to the lack of transparency from the vendors. The article also highlights OpenAI's policy that data submitted by API users is not used to train their models or improve their services. However, the author points out that the policy is relatively new and data submitted before March 2023 may have been used if the customer hadn't opted out. The article also brings up potential security risks with AI vendors logging inputs and the possibility of data breaches. The author suggests that openly licensed models that can be run on personal hardware may be a solution to these concerns.

 
 

It's coming along nicely, I hope I'll be able to release it in the next few days.

Screenshot:

How It Works:

I am a bot that generates summaries of Lemmy comments and posts.

  • Just mention me in a comment or post, and I will generate a summary for you.
  • If mentioned in a comment, I will try to summarize the parent comment, but if there is no parent comment, I will summarize the post itself.
  • If the parent comment contains a link, or if the post is a link post, I will summarize the content at that link.
  • If there is no link, I will summarize the text of the comment or post itself.

Extra Info in Comments:

Prompt Injection:

Of course it's really easy (but mostly harmless) to break it using prompt injection:

It will only be available in communities that explicitly allow it. I hope it will be useful, I'm generally very satisfied with the quality of the summaries.

 
 

Link to original tweet:

https://twitter.com/sayashk/status/1671576723580936193?s=46&t=OEG0fcSTxko2ppiL47BW1Q

Screenshot:

Transcript:

I'd heard that GPT-4's image analysis feature wasn't available to the public because it could be used to break Captcha.

Turns out it's true: The new Bing can break captcha, despite saying it won't: (image)

 
 

This is a fascinating discussion of the relationship between goals and intelligence from an AI safety perspective.

I asked my trusty friend GPT-4 to summarize the video (I downloaded the subtitles and fed them into ChatGPT), but I highly recommend just watching the entire thing if you have the time.

Summary by GPT-4:

Introduction:

  • The video aims to respond to some misconceptions about the Orthogonality Thesis in Artificial General Intelligence (AGI) safety.
  • This arises from a thought experiment where an AGI has a simple goal of collecting stamps, which could cause problems due to unintended consequences.

Understanding 'Is' and 'Ought' Statements (Hume's Guillotine):

  • The video describes the concept of 'Is' and 'Ought' statements. 'Is' statements are about how the world is or will be, while 'Ought' statements are about how the world should be or what we want.
  • Hume's Guillotine suggests that you can never derive an 'Ought' statement using only 'Is' statements. To derive an 'Ought' statement, you need at least one other 'Ought' statement.

Defining Intelligence:

  • Intelligence in AGI systems refers to the ability to take actions in the world to achieve their goals or maximize their utility functions.
  • This involves having or building an accurate model of reality, using it to make predictions, and choosing the best possible actions.
  • These actions are determined by the system's goals, which are 'Ought' statements.

Are Goals Stupid?

  • Some commenters suggested that single-mindedly pursuing one goal (like stamp collecting) is unintelligent.
  • However, this only seems unintelligent from a human perspective with different goals.
  • Intelligence is separate from goals; it is the ability to reason about the world to achieve these goals, whatever they may be.

Can AGIs Choose Their Own Goals?

  • The video suggests that while AGIs can choose their own instrumental goals, changing terminal goals is rare and generally undesirable.
  • Terminal goals can't be considered "stupid", as they can't be judged against anything. They're simply the goals the system has.

Can AGIs Reason About Morality?

  • While a superintelligent AGI could understand human morality, it doesn't mean it would act according to it.
  • Its actions are determined by its terminal goals, not its understanding of human ethics.

The Orthogonality Thesis:

  • The Orthogonality Thesis suggests that any level of intelligence is compatible with any set of goals.
  • The level of intelligence is about effectiveness at answering 'Is' questions, and goals are about 'Ought' questions.
  • Therefore, it's possible to create a powerful intelligence that will pursue any specified goal.
  • The level of an agent's intelligence doesn't determine its goals and vice versa.
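The separation between 'is' and 'ought' can be shown with a toy planner. This is only an illustrative sketch: the world model, the action names, and the two utility functions are invented for the example. The point is that the same prediction and search code serves any goal you plug in.

```python
from typing import Callable, Dict, Iterable

State = Dict[str, int]


def predict(state: State, action: str) -> State:
    """World model ('is'): what the world looks like after an action."""
    new_state = dict(state)
    if action == "buy_stamps":
        new_state["stamps"] += 10
        new_state["money"] -= 5
    elif action == "work":
        new_state["money"] += 20
    # Any other action (for example "rest") leaves the state unchanged.
    return new_state


def choose(state: State,
           actions: Iterable[str],
           utility: Callable[[State], float]) -> str:
    """Planner: pick the action whose predicted outcome scores highest."""
    return max(actions, key=lambda action: utility(predict(state, action)))


def stamp_utility(state: State) -> float:
    """Goal ('ought') of a stamp collector."""
    return state["stamps"]


def money_utility(state: State) -> float:
    """Goal ('ought') of a money maximizer."""
    return state["money"]
```

Starting from `{"stamps": 0, "money": 10}` with the actions `["buy_stamps", "work", "rest"]`, the planner picks "buy_stamps" under `stamp_utility` and "work" under `money_utility`. Nothing in `predict` or `choose` changes between the two runs.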
 
 

cross-posted from: https://lemmy.fmhy.ml/post/125116

The new wave of AI systems, ChatGPT and its more powerful successors, exhibit extraordinary capabilities across a broad swath of domains. In light of this, we discuss whether artificial INTELLIGENCE has arrived.

Paper available here: https://arxiv.org/abs/2303.12712

Video recorded at MIT on March 22nd, 2023

 
 

TL;DR (by GPT-4 🤖):

  • Use of AI Tools: The author routinely uses GPT-4 to answer casual and vaguely phrased questions, draft complex documents, and provide emotional support. GPT-4 can serve as a compassionate listener, an enthusiastic sounding board, a creative muse, a translator or teacher, or a devil’s advocate.

  • Large Language Models (LLM) and Expertise: LLMs can often persuasively mimic correct expert responses in a given knowledge domain, such as research mathematics. However, the responses often consist of nonsense when inspected closely. The author suggests that both humans and AI need to develop skills to analyze this new type of text.

  • AI in Mathematical Research: The author believes that the 2023-level AI can already generate suggestive hints and promising leads to a working mathematician and participate actively in the decision-making process. With the integration of tools such as formal proof verifiers, internet search, and symbolic math packages, the author expects that 2026-level AI, when used properly, will be a trustworthy co-author in mathematical research, and in many other fields as well.

  • Impact on Human Institutions and Practices: The author raises questions about how existing human institutions and practices will adapt to the rise of AI. For example, how will research journals change their publishing and referencing practices when AI can generate entry-level math papers for graduate students in less than a day? How will our approach to graduate education change? Will we actively encourage and train our students to use these tools?

  • Challenges and Future Expectations: The author acknowledges that we are largely unprepared to address these questions. There will be shocking demonstrations of AI-assisted achievement and courageous experiments to incorporate them into our professional structures. But there will also be embarrassing mistakes, controversies, painful disruptions, heated debates, and hasty decisions. The greatest challenge will be transitioning to a new AI-assisted world as safely, wisely, and equitably as possible.

 
 

Original tweet: https://twitter.com/emollick/status/1671528847035056128

Screenshots (from the tweet):

 
 

I’ve been following the development of the next Stable Diffusion model, and I’ve seen this approach mentioned.

Seems like this is a way in which AI training is analogous to human learning - we learn quite a lot from fiction, games, simulations and apply this to the real world. I’m sure the same pitfalls apply as well.

 
 

Quote:

In this work, we introduce TinyStories, a synthetic dataset of short stories that only contain words that a typical 3 to 4-year-olds usually understand, generated by GPT-3.5 and GPT-4. We show that TinyStories can be used to train and evaluate LMs that are much smaller than the state-of-the-art models (below 10 million total parameters), or have much simpler architectures (with only one transformer block), yet still produce fluent and consistent stories with several paragraphs that are diverse and have almost perfect grammar, and demonstrate reasoning capabilities.

Related:

 
 

This is the potential development in AI I'm most interested in. So naturally, I tested this when I first used ChatGPT. In classic ChatGPT fashion, when asked to make a directed acyclic graph representing cause and effect, it could interpret that well enough to make a simple graph...but got the cause and effect flow wrong for something as simple as lighting a fire. Haven't tried it again with ChatGPT-4 though.
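For reference, a cause-and-effect DAG of this kind is small enough to write by hand and check mechanically. The sketch below assumes Python 3.9+ for the standard-library graphlib module; the node names are my own guess at a reasonable fire-lighting graph, not ChatGPT's output.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set

# Each key is an effect; its value is the set of its direct causes.
FIRE_CAUSES: Dict[str, Set[str]] = {
    "spark": {"strike match"},
    "ignition": {"spark", "dry kindling", "oxygen"},
    "fire": {"ignition"},
    "heat and light": {"fire"},
}


def causal_order(causes: Dict[str, Set[str]]) -> List[str]:
    """Return the nodes so that every cause comes before its effects."""
    return list(TopologicalSorter(causes).static_order())


def is_acyclic(causes: Dict[str, Set[str]]) -> bool:
    """True if the cause-and-effect graph contains no cycle."""
    try:
        causal_order(causes)
    except CycleError:
        return False
    return True
```

A check like `is_acyclic` catches graphs where an effect ends up listed as one of its own causes, and the ordering makes it easy to spot an arrow pointing the wrong way.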

 
 

AI isn’t magic, of course, but what this weirdness practically means is that these new tools, which are trained on vast swathes of humanity’s cultural heritage, can often best be wielded by people who have a knowledge of that heritage. To get the AI to do unique things, you need to understand parts of culture more deeply than everyone else using the same AI systems.
