nicAI | my personal AI assistant

A hobby project with a long-term goal: to help guide my AI learning

23 Jul 2022

Goals:
- learn machine learning & AI with a specific and personal project
- leverage personal content structured as part of this site (e.g. movies, music)

First steps

Learning Computer Science

Learning AI

Github repos to explore

Fine-tuning a model

27 Aug 2024

OpenAI's GPT-4o can now be fine-tuned: https://openai.com/index/gpt-4o-fine-tuning/

Dashboard: https://platform.openai.com/finetune

nicai/240827-gpt-fine-tune-model.jpg

The fine-tune creation form:

- Base model
- Training data: a .jsonl file to use for training (upload new or select existing)
- Validation data (optional): a .jsonl file to use for validation metrics
- Suffix: a custom string appended to the output model name (e.g. my-experiment)
- Seed: controls the reproducibility of the job; the same seed and job parameters should produce the same results, with rare exceptions; randomly generated if not specified
- Hyperparameters (all default to auto):
  - Batch size
  - Learning rate multiplier (a range of 0.1–10 is recommended in most cases)
  - Number of epochs (a range of 1–10 is recommended in most cases)

Learn more: https://platform.openai.com/docs/guides/fine-tuning
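
The same job can be created via the API. A minimal sketch with the Python SDK (the file name, suffix, and epoch count below are my own example values):

from openai import OpenAI

client = OpenAI()

# upload the training file first
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# start the fine-tuning job on the GPT-4o snapshot
job = client.fine_tuning.jobs.create(
    model="gpt-4o-2024-08-06",
    training_file=training_file.id,
    suffix="my-experiment",
    hyperparameters={"n_epochs": 3},  # batch size and LR multiplier stay on auto
)
print(job.id, job.status)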

Explore first prompt engineering tactics: https://platform.openai.com/docs/guides/prompt-engineering

OpenAI Assistant

02 Oct 2024

Started testing OpenAI's Assistants API - it's very good.

Migrated NicKalGPT to the OpenAI Playground:

https://platform.openai.com/playground/assistants

and then used it via API as follows:

import json
import os
import time

from openai import OpenAI

client = OpenAI()

ASSISTANT_ID = os.getenv("NIC_KAL_GPT")

def wait_on_run(run, thread):
    # poll until the run leaves the queued/in_progress states
    while run.status in ("queued", "in_progress"):
        run = client.beta.threads.runs.retrieve(
            thread_id=thread.id,
            run_id=run.id,
        )
        time.sleep(0.5)
    return run


# contact_data (defined elsewhere) holds the prospect's details
user_prompt = f"""
write the following:

- a cold email, using known frameworks, and in my style
- 3 key questions to ask on a cold call
- a personalised LinkedIn connection request message

to:

{contact_data}

Write in:
    - German, if contact is in Germany or Austria
    - French, if contact is in France
    - English, otherwise, or if unsure

Never start emails with "I hope this email finds you well" or "I hope you are doing well", or "Quick question". 
Instead, start with a question or a statement that shows you know something about the person or their company.
Use the informal way to refer to the company's name, as if you were talking to a friend (eg "BMW" not "BMW Group").
When using client references from Kaltura, choose relevant ones that are likely to be known by the recipient, so either in the same industry or well-known Enterprises.

Only return the content of the email, including Subject line, but do not add any extra comments to your answer.

Return as JSON following this format:

{{
    "email_subject": "Subject Line",
    "email_body": "Email Body",
    "questions": ["Question 1", "Question 2", "Question 3"],
    "linkedin": "Message"
}}

Do not return anything else, as I will parse your response programmatically.
Remove any Markdown formatting from your response.
"""

thread = client.beta.threads.create()

message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=user_prompt,
)

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=ASSISTANT_ID,
)

run = wait_on_run(run, thread)

messages = client.beta.threads.messages.list(thread_id=thread.id)

message = messages.data[0].content[0].text.value

print("\n\nmessage:")
print()
print(message)

print("\n\n")

# Parse the JSON data
data = json.loads(message)

# Accessing individual fields
email_subject = data["email_subject"]
email_body = data["email_body"]
questions = data["questions"]
linkedin_message = data["linkedin"]

# Printing extracted data
# print(f"\n\nOutbound Suggestions for {x.first} {x.last} at {x.company} located in {x.country}:\n")
print("Email Subject:\n", email_subject)
print("\nEmail Body:\n", email_body)
print("\n\nQuestions for cold call:")
for question in questions:
    print("-", question)
print("\n\nLinkedIn Connect Message:\n", linkedin_message)

03 Oct 2024

Playground: https://platform.openai.com/playground/assistants

Structured Outputs don't seem to work when "File Search" (vector store) is enabled? 🤔
https://platform.openai.com/docs/guides/structured-outputs/introduction

Prompts

"You will find below some information.
Leverage the knowledge provided, but try to think beyond what has been shared and list all the use cases CompanyX could use Kaltura for.
Write these use cases and benefits in my style.
Provide the answer in Markdown format, in a code block, without using headers nor bold font."


OpenAI's Responses API

14 Jun 2025

Switching from the Assistants API to the Responses API, as the Assistants API is due to be deprecated and limits access to some of the reasoning models (like o3). The Responses API is where OpenAI's focus is: it includes file_search for RAG, web search, and access to all models.

A high-level blueprint and workflow for building my “digital twin” sales assistant using OpenAI’s Responses API: organize the code into clear modules, keep separate knowledge and style stores for targeted retrieval, automate daily updates, and implement RAG pipelines that blend my sales playbook, Kaltura knowledge, account context, and writing style.

sales_assistant/
├── config/
│   ├── config.yaml          # API keys, vector store IDs, metadata filters
│   └── .env                 # env vars for OPENAI_API_KEY
├── data/
│   ├── playbook.md          # how you sell
│   ├── kaltura_overview.md  # business knowledge
│   ├── kaltura_tech.md      # technical KB
│   ├── accounts/            # per-account markdown files
│   ├── emails.json          # your 3.5k email examples
│   └── linkedin.json        # your 1.2k LinkedIn messages
├── src/
│   ├── __init__.py
│   ├── ingestion.py         # upload/update vector stores
│   ├── stores.py            # wrapper for Responses API & file_search
│   ├── generators/
│   │   ├── rag_chain.py     # assemble RAG prompts
│   │   ├── email_gen.py     # cold/follow-up flows
│   │   └── linkedin_gen.py  # connect message flows
│   └── utils.py             # shared helpers (chunking, metadata filters)
├── tests/                   # unit/integration tests
├── scripts/
│   ├── daily_update.sh      # cron job for ingestion
│   └── rebuild_index.sh     # periodic full rebuild
├── README.md
└── requirements.txt         # pin openai>=1.33.0, pyyaml, python-dotenv

Data Ingestion & Vector Store Management

Separate Stores by Function
• Knowledge Store: ingest playbook.md, kaltura_*.md, and account-specific MDs for commercial/technical RAG.
• Style Store: ingest emails.json and linkedin.json, each example as a chunk with attributes={"type": "email"} or {"type": "linkedin"} for filtered retrieval.

Incremental Daily Updates
• Use file timestamps or checksums to detect new/modified files (see the sketch after this list).
• Delete old chunks by file_id metadata tag, then re-upload only deltas via vector_stores.file_batches.upload_and_poll(...).
• Schedule with cron or Airflow to run scripts/daily_update.sh every dawn for fresh context.
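
A sketch of the checksum-based delta detection (manifest.json and the paths are my own naming; the store ID is elided, and the chunk-deletion step is omitted):

import hashlib
import json
from pathlib import Path

from openai import OpenAI

client = OpenAI()

def checksum(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(data_dir: Path, manifest_path: Path) -> list[Path]:
    # compare current checksums against the last run's manifest
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    changed = [
        p for p in sorted(data_dir.rglob("*"))
        if p.is_file() and manifest.get(str(p)) != checksum(p)
    ]
    # persist the new checksums so the next run only sees fresh deltas
    manifest.update({str(p): checksum(p) for p in changed})
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return changed

deltas = changed_files(Path("data"), Path("config/manifest.json"))
if deltas:
    # re-upload only the deltas (client.beta.vector_stores on older SDK versions)
    client.vector_stores.file_batches.upload_and_poll(
        vector_store_id="vs_...",
        files=[p.open("rb") for p in deltas],
    )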

RAG & Style-Mimic Pipelines

Retrieval-Augmented Generation
1. Load Context via the file_search tool across the selected vector store(s) with filters={"type": "knowledge"} or an account ID.
2. Assemble Prompt:

System: Write in Nic’s voice using these snippets: [snippets]
User: [Your query]

which ensures style and factual grounding.

3. Generate with client.responses.create(model="gpt-4o-mini", input=..., tools=[{"type": "file_search", ...}]).
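
A minimal sketch of step 3 with file_search scoped by an attribute filter (the store ID is elided; the instructions and input strings are my own examples):

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    instructions="Write in Nic's voice, grounded only in the retrieved snippets.",
    input="Draft a short value summary of Kaltura for a media company.",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_..."],  # knowledge store
        "filters": {"type": "eq", "key": "type", "value": "knowledge"},
    }],
)
print(response.output_text)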

Style-Only Retrieval
• For email or LinkedIn generation, invoke file_search on the Style Store with filters={"type": "email"} or {"type": "linkedin"} and include the top-k examples before your user scenario.
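
For the style-only case, the file_search tool entry can take a compound filter; a sketch (store ID elided, and the language attribute assumes the tagging scheme described below), plugged into the same responses.create call as above:

style_tools = [{
    "type": "file_search",
    "vector_store_ids": ["vs_..."],  # style store
    "filters": {
        "type": "and",
        "filters": [
            {"type": "eq", "key": "type", "value": "email"},
            {"type": "eq", "key": "language", "value": "de"},
        ],
    },
    "max_num_results": 5,  # top-k examples blended into the prompt
}]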

Automation & Workflows
• CI/CD: run tests on src/ and validate ingestion scripts on push.
• Monitoring: log ingestion times, vector store statuses, and tool latencies in Datadog or Prometheus.
• Credential Management: store API keys in Vault or environment variables, loaded via python-dotenv.

Best Practices
• Metadata & Filtering: tag each chunk with file source, date, account, and type to scope retrieval accurately .
• Chunk Size & Overlap: use ~800-token chunks with ~400-token overlap for coherent snippets .
• Cache Hot Queries: store frequent context lookups in Redis to reduce vector search calls and cost .
• Periodic Rebuilds: quarterly full index rebuild to defragment store and refresh embeddings .
• Testing & Validation: include unit tests for prompt templates and integration tests for end-to-end RAG flows .

This architecture will let you spin up dedicated chatbots per account, generate on-brand emails and LinkedIn touches, and keep your knowledge base and style models automatically in sync—effectively replicating you at scale.

best file attributes and potential values to cover all filtering scenarios

Summary

To enable flexible, efficient filtering in your OpenAI vector stores, define metadata attributes across four categories: Identification (e.g., file_id, account_id), Content (e.g., type, topic, format, language), Organizational (e.g., created_date, modified_date, version, tags), and Access & Security (e.g., sensitivity, owner, region). Use low-cardinality fields like type and topic for broad filtering, and reserve high-cardinality fields like file_id or account_id for precise lookups to avoid performance hits. Apply hierarchical or composite IDs when you need to delete or update by parent document, to minimize expensive metadata-based deletions.

Recommended File Attributes

  1. Identification Attributes

Identification attributes uniquely mark each file or group for quick lookup and management.
• file_id: Unique identifier assigned by the system (e.g., file-).
• account_id: Identifier for the client or account context (e.g., ABB, Novo_Nordisk).
• parent_id (optional): Use for chunked documents to link chunks back to the original file (e.g., doc123-0, doc123-1).

  2. Content Attributes

Content attributes describe what the file contains to scope semantic retrieval precisely.
• type: Class of content (e.g., playbook, business_knowledge, technical_knowledge, account_profile, email_example, linkedin_example).
• topic: Subject area (e.g., sales_process, product_features, integration, security, ROI).
• format: File format/MIME type (markdown, json, pdf, docx, text/plain).
• language: Natural language code (e.g., en, fr, de, es).

  3. Organizational Attributes

Organizational attributes help you manage document versions and chronological retrieval.
• created_date: ISO date when the file was first ingested (YYYY-MM-DD).
• modified_date: ISO date of the latest update (YYYY-MM-DD).
• version: Version identifier (e.g., v1, v2.1, 2025.06).
• tags: Arbitrary labels for ad-hoc grouping (e.g., critical, beta, deprecated, urgent).

  1. Access & Security Attributes

Access and security attributes restrict retrieval to authorized or relevant contexts.
• sensitivity: Access level (public, internal, confidential, secret).
• owner: Responsible individual or team (e.g., nic, sales_team, engineering).
• region: Geographic or business region (e.g., EMEA, APAC, AMER).

By tagging each file with this comprehensive attribute set—while keeping high-cardinality fields to a minimum and using hierarchical ID strategies for chunked documents—you’ll achieve performant, precise vector retrieval across all your RAG workflows.
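
A sketch of tagging a file with this scheme at ingestion time (IDs elided; the attribute values are example placeholders):

from openai import OpenAI

client = OpenAI()

client.vector_stores.files.create(
    vector_store_id="vs_...",
    file_id="file-...",
    attributes={
        "account_id": "ABB",
        "type": "account_profile",
        "topic": "sales_process",
        "format": "markdown",
        "language": "en",
        "modified_date": "2025-06-14",
        "sensitivity": "internal",
        "owner": "nic",
        "region": "EMEA",
    },
)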

Resources

Progress

26 Jun 2025

Question:

I'm adding client notes to a vector store programmatically like this:

vector_stores["nicai_knowledge"]["files"][fname] = {
        "path": path,
        "attributes": {
            "content_type":   "account_notes",
            "content_format": "markdown",
            "content_language":"en",
            "file_name":      fname.replace(".md","")
        }
    }

What is the best way to prompt my chatbot to retrieve only data from a specific account, e.g. where "file_name" == "abb"?

Answer:

Use a metadata filter in the retriever instead of relying on the prompt:

# build a retriever that only sees docs for account "abb"
# (framework-style pseudocode; the filter is applied at retrieval time)
retriever = vector_stores["nicai_knowledge"].as_retriever(
    search_kwargs={"filter": {"file_name": "abb"}}
)

in your chain / agent:

response = chat_llm(
    messages=[
        {"role": "system", "content": "Answer using the provided context only."},
        {"role": "user", "content": user_question}
    ],
    context_documents=retriever  # or however your framework injects context
)

Because the filter is applied before embeddings are compared, no other accounts’ vectors can reach the LLM—much cleaner and more secure than trying to force the model with phrasing like “Only use ABB data.”
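
With the Responses API specifically, the same scoping is a file_search attribute filter; a sketch assuming the attributes set during ingestion above (store ID elided):

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    instructions="Answer using the provided context only.",
    input=user_question,
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_..."],  # nicai_knowledge store
        "filters": {"type": "eq", "key": "file_name", "value": "abb"},
    }],
)
print(response.output_text)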
