The crazy and frustrating GenAI journey
AI is a brand new field that tends to make people either really excited or really scared. Some fear corruption and what bad actors may do with the power that comes with it. Some fear it will take their jobs, while others invest tons of money hoping it will bring a huge ROI. I'm not convinced either of those is true. From my limited experience developing a couple of chatbots backed by Amazon Bedrock, this isn't an exact science, and I don't know what frustrates a developer more than outputs that don't match what the input would lead you to believe.
One of the most frustrating aspects I've experienced is creating a chatbot front end and then hooking it up to send messages to an LLM like the ones provided by Anthropic. The models are great, but if you don't pass the LLM a good prompt, the responses may be so bad you begin to think about throwing things across the room. Working with AI has triggered many aggravating moments that have caused me to wonder if any of this is even worth it. At this point I honestly don't know for myself whether AI is all it's cracked up to be. Is it the future, or is it a fad?
Prompt Engineering
The best explanation I've heard is that an LLM is really smart with tons of knowledge, but it's like a child that needs very direct instructions, or you will end up way off course. I went to chatgpt.com and asked the following question: "can you help me define prompt engineering as it relates to GenAI?" This is the answer I received:
Definition of Prompt Engineering in GenAI
Prompt engineering is the process of crafting, optimizing, and iteratively refining the inputs (or "prompts") provided to a generative AI model to achieve desired outputs. It involves understanding the capabilities and constraints of the model, designing effective queries, and systematically experimenting with variations to elicit specific, high-quality responses.
Key Elements of Prompt Engineering
- Clarity: Ensuring the prompt is concise and unambiguous to guide the AI model effectively.
- Context: Providing the necessary background or framing so the model understands the scope of the task.
- Structure: Organizing the prompt in a way that aligns with the model's training, such as using instructions, examples, or a question-answer format.
- Iteration: Testing and adjusting the prompt based on the outputs to improve accuracy and relevance.
Importance in GenAI
- Control Output Quality: By carefully crafting prompts, users can influence the tone, specificity, and structure of the generated responses.
- Reduce Errors: Effective prompt engineering can minimize irrelevant or inaccurate outputs.
- Maximize Model Utility: Leveraging the model's capabilities by designing prompts that play to its strengths.
Examples in Practice
- Query Refinement:
  - Simple prompt: "Summarize the article."
  - Refined prompt: "Summarize the key points of the article in bullet points, focusing on recent advancements in AI technology."
- Task Specification:
  - Simple prompt: "Generate a story."
  - Refined prompt: "Write a short story about a futuristic city where AI governs daily life, focusing on the ethical dilemmas faced by the inhabitants."
- Using Context:
  - Simple prompt: "What is Bedrock?"
  - Refined prompt: "Explain Amazon Bedrock as a service for generative AI, including its key features and applications."
I have found the examples and process outlined above to be essential. If you can't explain, or don't explain, exactly what you expect to have returned to you, there's no way an LLM will be able to read your mind and give you what you want.
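To make that concrete in the language the rest of this post uses, here is a trivial sketch of the difference between a vague prompt and a refined one; the strings are just illustrations:

// A vague prompt leaves the model guessing about scope, format, and audience.
const simplePrompt = 'What is Bedrock?'

// A refined prompt spells out the subject, the structure you want back, and who it is for.
const refinedPrompt =
  'Explain Amazon Bedrock as a service for generative AI. ' +
  'Include its key features and common applications. ' +
  'Answer in three short bullet points aimed at a developer who is new to AWS.'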
Connecting to Amazon Bedrock
I'm talking about Amazon Bedrock here because this is what we are using at work, and thus it's where my experience with GenAI comes from. When you begin to use Amazon Bedrock you have to know what your architectural plan is moving forward, or you will head down the wrong path for miles and miles before you realize anything has gone awry. If you are a Python developer you will find great examples of how to connect to Bedrock and do all the things you want to do. Most examples use Python, so you can copy and paste them and modify where you need to. I'm a JavaScript/Node.js developer, so things aren't quite the same, and the examples are far from good. As of the day of writing this blog post, I haven't found any good Node.js examples for doing any of this. Everything I've accomplished, I've done by reading confusing documentation with some trial and error.
I just want to walk through a couple of approaches with examples that represent the kinds of things I've done.
ConverseCommand
The ConverseCommand is a command from the NPM library @aws-sdk/client-bedrock-runtime and is used for basic conversational interactions with an LLM provided by Amazon Bedrock. This library will not keep track of your session for you the way ChatGPT does. User and session data need to be managed by you, and a common way to do this is to store messages in something like DynamoDB tables. I have handled this by creating a session ID in a client-side application and passing it to an AWS Lambda for processing with Bedrock and DynamoDB.
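As a rough sketch of that flow, the client side might generate a session ID once and include it with every message it sends to the Lambda; the endpoint path and payload shape here are hypothetical:

// Browser-side sketch: create a session ID once and reuse it for the whole conversation.
const sessionId = crypto.randomUUID()

const sendMessage = async (text) => {
  // '/api/chat' is a placeholder for whatever API Gateway route or function URL fronts your Lambda.
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ sessionId, message: text })
  })
  return res.json()
}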
This is an example of how to use this library in a Lambda. For further information please see the ConverseCommand documentation.
const { BedrockRuntimeClient, ConverseCommand } = require('@aws-sdk/client-bedrock-runtime')
const REGION = process.env.REGION
const client = new BedrockRuntimeClient({ region: REGION })
const converse = async () => {
  const input = {
    modelId: 'your-llm-id', // You will find this in Amazon Bedrock after the desired one is enabled
    messages: [
      {
        role: 'user',
        content: [{ text: 'What is the capital of France' }]
      }
    ],
    inferenceConfig: {
      maxTokens: 4096, // The maximum number of tokens to allow in the generated response.
      temperature: 0.5, // How "creative" the LLM is allowed to be, between 0 and 1.
      topP: 1 // The percentage of most-likely candidates that the model considers for the next token.
    }
  }
  try {
    const response = await client.send(new ConverseCommand(input))
    // Optional chaining guards against a missing or empty response payload.
    return response.output?.message?.content?.[0]?.text
  } catch (err) {
    console.error(err)
  }
}
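This example assumes you have another function that handles an event and calls this one. A hypothetical sketch of that handler, assuming a JSON body arriving through API Gateway, might look like this:

// Hypothetical handler: parse the incoming body, call converse, and return the reply.
// This assumes converse has been adapted to accept the user's text instead of a hard-coded question.
exports.handler = async (event) => {
  const { sessionId, message } = JSON.parse(event.body || '{}')
  const reply = await converse(message)
  return {
    statusCode: 200,
    body: JSON.stringify({ sessionId, reply })
  }
}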
If you want to save the interaction in DynamoDB, you could do something like the following:
const { DynamoDBClient } = require('@aws-sdk/client-dynamodb')
const { DynamoDBDocument, PutCommand } = require('@aws-sdk/lib-dynamodb')
const ddbClient = new DynamoDBClient()
const ddbDoc = DynamoDBDocument.from(ddbClient)
const TABLE = process.env.TABLE
const create = async (record) => {
  const putParams = {
    TableName: TABLE,
    Item: record,
    ConditionExpression: 'attribute_not_exists(recordId)'
  }
  try {
    await ddbDoc.send(new PutCommand(putParams))
  } catch (err) {
    console.error(err)
  }
}
What this will allow you to do is fetch the history saved in DynamoDB and send it as part of your prompt on subsequent calls to Amazon Bedrock. Keep in mind that, depending on the size of the data being sent to the LLM, the call to Bedrock can fail if the attached history grows too large. Also note that these are not all the steps necessary to make a chatbot fully functional; they are merely some basic examples of how to make calls and get results.
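As a rough sketch of that idea, reusing the ddbDoc and TABLE setup from above and assuming a table keyed by sessionId (a layout I'm inventing here for illustration), fetching the history and folding it into the messages shape ConverseCommand expects might look like this:

const { QueryCommand } = require('@aws-sdk/lib-dynamodb')

// Hypothetical table layout: partition key sessionId, each item holding a role and text.
const getHistory = async (sessionId) => {
  const result = await ddbDoc.send(new QueryCommand({
    TableName: TABLE,
    KeyConditionExpression: 'sessionId = :sid',
    ExpressionAttributeValues: { ':sid': sessionId }
  }))
  // Convert the stored items into the messages array that ConverseCommand expects.
  return (result.Items || []).map((item) => ({
    role: item.role,
    content: [{ text: item.text }]
  }))
}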
RetrieveAndGenerateCommand
The RetrieveAndGenerateCommand is designed to use an Amazon Bedrock knowledge base so the LLM can answer with specific information. I'm not going to go into everything here about Bedrock knowledge bases, but suffice it to say that a knowledge base is a way to store organization-specific information that an LLM can be instructed to use when generating responses.
In my organization we built an internal developer portal. That developer portal was built using NextJS backed by Contentstack as our headless CMS. We use GitLab for our source control and utilize the pipeline capabilities provided by GitLab. We produced a library to be used in our pipelines so that when code was checked in and the pipeline triggered, certain data like versions would update entries in Contentstack. Our developer portal would then build, pull in all of the data around our code projects, and display it for our internal users.
A big problem we ran into was the ability to search through all of that data to find answers to questions like "how do I call a particular API?" or "who owns a particular project?" Our solution was to add a GenAI chatbot to the developer portal, backed by a knowledge base containing the same data as the developer portal. Amazon Bedrock provides the ability to point a knowledge base at an S3 bucket, and that was our approach. I created a project that pulls all of our data out of Contentstack and processes it into Markdown along with some metadata files. These generated files are then uploaded to S3, where the knowledge base can use the information. The prompt sent to Amazon Bedrock then instructs the LLM that the only place it should look for answers is that knowledge base. If it can't find the answer, it tells the user that.
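The upload step itself can be done with the S3 SDK. Here is a rough sketch; the bucket environment variable, key structure, and metadata sidecar naming are assumptions for illustration rather than something you have to copy:

const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3')

const s3 = new S3Client({ region: process.env.REGION })

// Hypothetical: push one generated Markdown file and its metadata sidecar to the bucket
// that the knowledge base data source points at.
const uploadDoc = async (slug, markdown, metadata) => {
  await s3.send(new PutObjectCommand({
    Bucket: process.env.KB_BUCKET, // assumed environment variable
    Key: `docs/${slug}.md`,
    Body: markdown,
    ContentType: 'text/markdown'
  }))
  await s3.send(new PutObjectCommand({
    Bucket: process.env.KB_BUCKET,
    Key: `docs/${slug}.md.metadata.json`,
    Body: JSON.stringify(metadata),
    ContentType: 'application/json'
  }))
}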
Here is a code example of how the RetrieveAndGenerateCommand can be used. It is important to note that the RetrieveAndGenerateCommand returns an Amazon Bedrock sessionId that you should use on subsequent calls. By default, Bedrock sessions are good for 24 hours.
const { BedrockAgentRuntimeClient, RetrieveAndGenerateCommand } = require('@aws-sdk/client-bedrock-agent-runtime')
const REGION = process.env.REGION
const client = new BedrockAgentRuntimeClient({ region: REGION })
const retrieveAndGenerate = async () => {
  const input = {
    input: {
      text: 'your-message'
    },
    sessionId: 'your-session-id',
    retrieveAndGenerateConfiguration: {
      type: 'KNOWLEDGE_BASE',
      knowledgeBaseConfiguration: {
        knowledgeBaseId: 'id-of-knowledge-base',
        modelArn: 'your-model-arn',
        generationConfiguration: {
          promptTemplate: {
            textPromptTemplate: 'your-template-for-the-LLM-to-use'
          },
          inferenceConfig: {
            textInferenceConfig: {
              topP: 1,
              temperature: 0, // 0 enforces the idea that the knowledge base has the information
              maxTokens: 4096
            }
          }
        }
      }
    }
  }
  try {
    const command = new RetrieveAndGenerateCommand(input)
    const response = await client.send(command) // send returns a promise, so it must be awaited
    const generatedText = response.output?.text
    const sessionId = response.sessionId
    return { message: generatedText, sessionId }
  } catch (err) {
    console.error(err)
  }
}
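For the textPromptTemplate above, the template you provide is expected to include a placeholder where Bedrock injects the retrieved passages ($search_results$). A minimal sketch of the kind of template I mean, with the wording being just an example, might be:

// Rough sketch of a knowledge-base prompt template. Bedrock substitutes the retrieved
// passages in place of the $search_results$ placeholder.
const textPromptTemplate = `
You are an assistant for our internal developer portal.
Answer the question using only the information in the search results below.
If the answer is not in the search results, say that you could not find it.

$search_results$
`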
Conclusions
This is not an exhaustive example of how to use Amazon Bedrock to set up a chatbot, nor does it resolve all the questions a developer may want or need answered. My intent was merely to introduce the topic and show examples of how I have interacted with Bedrock and the use cases for each command. I have spent my whole career learning as I go, and working with GenAI is no exception. I began my career as an HTML and CSS developer. Over the years I have grown into a role where GenAI is now my focus, and I develop it front to back. By no means do I consider myself an expert or even great at this yet. I'm learning as I go, and the going is slow. So if you happen to read this, practice patience and keep plugging away. I don't know if there will be an ROI in the end, but learning new things should be fun, especially when it's bleeding edge for most of us.