TL;DR: I built a serverless dining concierge chatbot that collects your food preferences through conversation and emails you personalized restaurant suggestions. It uses S3, API Gateway, Lambda, Lex, SQS, DynamoDB, OpenSearch, SES, EventBridge, and IAM — all stitched together without a single server to manage. The architecture looks clean on a whiteboard. In practice, I hit session management bugs, API pagination walls, and a billing trap that the assignment itself warned me about.
The Problem
For a cloud computing course, I needed to build something that sounds simple: a chatbot that asks what kind of food you want, then emails you restaurant suggestions. The catch? It had to be fully serverless, use NLP for conversation, and decouple the chat experience from the recommendation engine.
The real challenge isn't any single service — it's making ten of them talk to each other correctly.
The Architecture
User ──▶ S3 (Static Site) ──▶ API Gateway ──▶ LF0 (Lambda)
│
▼
Amazon Lex
│
▼
LF1 (Lambda Code Hook)
│
┌─────┴─────┐
▼ ▼
SQS Queue DynamoDB
│ (User State)
▼
┌──── LF2 (Lambda Worker) ◀── EventBridge (every 1 min)
│ │
▼ ▼
OpenSearch DynamoDB
(Cuisine Index) (Restaurant Data)
│
▼
Amazon SES
(Email Delivery)
Three Lambdas, two databases, a search engine, a message queue, an NLP bot, and an email service. Each one is straightforward on its own. The interesting part is the seams between them.
The Hard Parts
1. The Session ID Bug That Broke Multi-Turn Conversations
This one was subtle. Lex V2 uses a sessionId to maintain conversation state across turns. When a user says "I want Japanese food" and Lex asks "How many people?", Lex needs to know that the next message ("2") is a response to that question — not a brand new conversation.
My first LF0 implementation generated a fresh UUID for every API call:
# lambda-functions/LF0/lambda_function.py — the broken version
session_id = str(uuid.uuid4()) # New session every message!
This meant every single message started a completely new Lex session. The chatbot would ask for your location, you'd say "Manhattan," and Lex would treat "Manhattan" as a brand new utterance — matching it to... nothing useful.
The fix was a two-part change. First, generate a session ID in the frontend that persists for the browser tab's lifetime:
// frontend/assets/js/chat.js
var chatSessionId = 'user-' + Math.random().toString(36).slice(2, 11) + '-' + Date.now();
Then pass it through the API body and have LF0 use it:
# lambda-functions/LF0/lambda_function.py — the fix
session_id = body.get("sessionId", str(uuid.uuid4()))
The lesson: in a stateless architecture, somebody has to own the session. In this case, it's the browser tab.
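To make that ownership concrete, here is a minimal sketch of how LF0 can honor the browser-supplied ID. The event shape assumes API Gateway's Lambda proxy integration, the helper name is my own, and the Lex call is shown only as a comment:

```python
import json
import uuid

def resolve_session_id(body: dict) -> str:
    # Reuse the ID the browser tab generated; mint a UUID only as a
    # fallback for clients that did not send one.
    return body.get("sessionId") or str(uuid.uuid4())

def lambda_handler(event, context):
    body = json.loads(event.get("body") or "{}")
    session_id = resolve_session_id(body)
    # The real LF0 forwards this ID to Lex V2 so multi-turn state survives:
    # boto3.client("lexv2-runtime").recognize_text(
    #     botId=BOT_ID, botAliasId=ALIAS_ID, localeId="en_US",
    #     sessionId=session_id, text=body["message"])
    return {"statusCode": 200,
            "body": json.dumps({"sessionId": session_id})}
```

Echoing the session ID back in the response also gives the frontend a cheap sanity check that the round trip preserved it.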
2. Yelp's Pagination Wall at 240 Results
The assignment required 1,000+ restaurants from Manhattan across at least 5 cuisines. I figured I'd just paginate through Yelp's API — 50 results per page, 200 per cuisine, five cuisines, done.
Except Yelp has an undocumented hard limit: offset + limit must be <= 240. With 50 results per page, the maximum offset is 190. That gives you 240 results per search query at most — but in practice, cross-cuisine overlap (a Japanese-Italian fusion place appearing in both searches) means you get fewer unique results.
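The ceiling is easy to respect once you plan the (offset, limit) pairs up front. A sketch using the 240/50 numbers from above, shrinking the final page rather than capping the offset at 190:

```python
def yelp_pages(cap: int = 240, page_size: int = 50):
    # Build (offset, limit) pairs that never violate offset + limit <= cap.
    # The final page shrinks so the last request still fits under the cap.
    pages = []
    offset = 0
    while offset < cap:
        limit = min(page_size, cap - offset)
        pages.append((offset, limit))
        offset += limit
    return pages
```

With the defaults this yields four full pages of 50 plus a final page of 40, for exactly 240 results per search query.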
My first run with 5 cuisines hit 983 unique restaurants. The fix was trivial: add a sixth cuisine (Thai). The real insight, though, is the deduplication strategy:
# other-scripts/yelp_scraper.py
all_restaurants = {} # keyed by BusinessID for dedup
for biz in businesses:
biz_id = biz["id"]
if biz_id not in all_restaurants:
all_restaurants[biz_id] = extract_restaurant_data(biz, cuisine)
By keying on BusinessID and processing cuisines sequentially, a fusion restaurant gets tagged with whichever cuisine found it first. Simple, deterministic, and no duplicates in DynamoDB.
3. Lex V2's Dual Code Hook Model
Amazon Lex V2 invokes your Lambda code hook at two different points, and mixing them up produces confusing behavior:
- DialogCodeHook fires after every user turn, for slot validation.
- FulfillmentCodeHook fires once, after all slots are filled, for the final action.
The response format is different too. During dialog, you return one of three actions:
# Delegate: "Lex, you handle the next question"
{"sessionState": {"dialogAction": {"type": "Delegate"}, ...}}
# ElicitSlot: "Ask the user to re-enter this specific slot"
{"sessionState": {"dialogAction": {"type": "ElicitSlot", "slotToElicit": "Location"}, ...}}
# Close: "We're done here"
{"sessionState": {"dialogAction": {"type": "Close"}, ...}}
The slot value extraction is also nested deeper than you'd expect. It's not slots["Location"] — it's slots["Location"]["value"]["interpretedValue"]. I wrote a helper to avoid repeating that traversal everywhere:
# lambda-functions/LF1/lambda_function.py
def get_slot_value(slots, slot_name):
slot = slots.get(slot_name)
if slot and slot.get("value"):
return slot["value"].get("interpretedValue")
return None
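Putting the two hook modes and the three dialog actions together, a minimal LF1 dispatcher might look like this. This is a sketch of the event shape described above, not the project's actual handler; the Manhattan check stands in for the business-logic validation mentioned later in the post:

```python
def get_slot_value(slots, slot_name):
    # Lex V2 nests the resolved value two levels deep.
    slot = slots.get(slot_name)
    if slot and slot.get("value"):
        return slot["value"].get("interpretedValue")
    return None

def delegate(intent):
    # "Lex, you handle the next question."
    return {"sessionState": {"dialogAction": {"type": "Delegate"},
                             "intent": intent}}

def elicit_slot(intent, slot_name, message):
    # Re-ask the user for one specific slot, with an explanation.
    return {"sessionState": {"dialogAction": {"type": "ElicitSlot",
                                              "slotToElicit": slot_name},
                             "intent": intent},
            "messages": [{"contentType": "PlainText", "content": message}]}

def close(intent, message):
    # End the conversation with a final message.
    intent["state"] = "Fulfilled"
    return {"sessionState": {"dialogAction": {"type": "Close"},
                             "intent": intent},
            "messages": [{"contentType": "PlainText", "content": message}]}

def lambda_handler(event, context):
    intent = event["sessionState"]["intent"]
    slots = intent.get("slots") or {}
    if event["invocationSource"] == "DialogCodeHook":
        location = get_slot_value(slots, "Location")
        if location and location.lower() != "manhattan":
            slots["Location"] = None  # clear it so Lex re-prompts
            return elicit_slot(intent, "Location",
                               "We only serve Manhattan right now.")
        return delegate(intent)  # let Lex ask the next question
    # FulfillmentCodeHook: all slots filled -- queue the request, confirm.
    return close(intent, "You're all set. Expect my suggestions shortly!")
```

Keeping the three response shapes in tiny builder functions makes it much harder to return a malformed dialogAction from one branch.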
4. The SQS Decoupling Pattern (and Why It Exists)
The most elegant part of this architecture is the SQS queue sitting between the chatbot (LF1) and the recommendation engine (LF2). The user doesn't wait for restaurant lookups, OpenSearch queries, or email delivery — they get an instant confirmation: "You're all set. Expect my suggestions shortly!"
LF1 (the producer) pushes a compact JSON message to the queue:
{
"Location": "Manhattan",
"Cuisine": "japanese",
"NumberOfPeople": "2",
"DiningTime": "19:00",
"Email": "user@example.com"
}
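LF1 can build that payload defensively. A sketch (the field names come from the message above; the send_message call is shown as a comment and QUEUE_URL is a placeholder):

```python
import json

REQUIRED_FIELDS = ("Location", "Cuisine", "NumberOfPeople",
                   "DiningTime", "Email")

def build_queue_message(slot_values: dict) -> str:
    # Fail fast in LF1 if a slot is missing, rather than letting LF2
    # discover it after the user has already been told "you're all set".
    missing = [f for f in REQUIRED_FIELDS if not slot_values.get(f)]
    if missing:
        raise ValueError(f"missing slots: {missing}")
    return json.dumps({f: slot_values[f] for f in REQUIRED_FIELDS})

# LF1 then pushes it to the queue, roughly:
# boto3.client("sqs").send_message(QueueUrl=QUEUE_URL,
#                                  MessageBody=build_queue_message(values))
```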
LF2 (the consumer) is triggered by EventBridge every minute, polls the queue, and does the heavy lifting — querying OpenSearch, enriching from DynamoDB, formatting and sending the email. If LF2 crashes, the message stays in the queue and gets retried automatically. If there's a traffic spike, messages queue up naturally.
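The consumer side stays crash-safe by deleting messages only after they are handled. A sketch of LF2's core loop, with the SQS client injected so the AWS call sites stay visible; `handle` stands in for the OpenSearch, DynamoDB, and SES work:

```python
import json

def process_messages(sqs_client, queue_url, handle):
    # Pull one batch; delete each message only after it is fully handled,
    # so a crash mid-batch leaves unprocessed messages visible for retry.
    resp = sqs_client.receive_message(QueueUrl=queue_url,
                                      MaxNumberOfMessages=10)
    handled = 0
    for msg in resp.get("Messages", []):
        handle(json.loads(msg["Body"]))   # search, enrich, email
        sqs_client.delete_message(QueueUrl=queue_url,
                                  ReceiptHandle=msg["ReceiptHandle"])
        handled += 1
    return handled
```

Because SQS only removes a message on an explicit delete_message call, the delete-after-handle ordering is what makes the retry behavior described above automatic.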
This producer-consumer pattern with SQS is the single most useful architectural decision in the project. It turns a synchronous "ask and wait" experience into an asynchronous "ask and get notified" one.
5. The OpenSearch Billing Trap
The assignment spec literally warns you: "It's very important you don't leave this domain on for a long time. It's not serverless, so you will be charged a lot of money."
Unlike every other service in this architecture (Lambda, SQS, DynamoDB, SES — all pay-per-use), OpenSearch runs a dedicated instance that charges by the hour. Even the smallest config (t3.small.search, 1 node) runs about $26/month.
The strategy: write all the code and data loading scripts first, create the OpenSearch domain only when you're ready to test end-to-end, and delete it immediately after submission. I wrote the loader script to be idempotent — it drops and recreates the index each run — so spinning up a fresh domain takes about 15 minutes plus a script execution.
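The idempotency hinges on OpenSearch's _bulk format: keying each document by BusinessID means a re-run overwrites instead of duplicating. A sketch of the payload builder (the indexed field names are illustrative, not the project's actual mapping):

```python
import json

def bulk_index_body(index, restaurants):
    # NDJSON for OpenSearch's _bulk API: one action line per document,
    # followed by the document itself. Using BusinessID as _id makes
    # re-running the loader overwrite rather than duplicate.
    lines = []
    for r in restaurants:
        lines.append(json.dumps({"index": {"_index": index,
                                           "_id": r["BusinessID"]}}))
        lines.append(json.dumps({"RestaurantID": r["BusinessID"],
                                 "Cuisine": r["Cuisine"]}))
    return "\n".join(lines) + "\n"

# The loader DELETEs and re-creates the index first, then POSTs this body
# to https://<domain-endpoint>/_bulk with Content-Type application/x-ndjson.
```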
What I'd Do Differently
Use Lex V2's built-in slot validation where possible. I wrote custom validation for every slot in LF1, but Lex V2 actually handles type validation (numbers, emails, dates) natively. My custom validation only adds value for business logic (like restricting location to Manhattan). I could have cut 30% of LF1's code.
Store the session ID in localStorage, not just a JavaScript variable. My current implementation loses the session on page reload. For the extra credit feature (remembering returning users), a localStorage-backed session ID would survive across visits.
Use DynamoDB Streams instead of polling SQS. An alternative architecture would have LF1 write directly to a DynamoDB "requests" table, with a DynamoDB Stream triggering LF2. This eliminates the SQS service entirely and gives you an automatic audit log. But the assignment specifically required SQS, so here we are.
Key Takeaways
- In serverless architectures, session management is your responsibility. There's no server keeping state between requests. The client must generate and persist the session ID, and every service in the chain must faithfully pass it through.
- Yelp's API has a hard pagination ceiling at 240 results. If you need more data per query, use multiple search terms or geographical subdivisions. Budget for cross-query deduplication.
- Decouple real-time interactions from async processing with SQS. The user gets instant feedback while heavy work happens in the background. This pattern scales naturally and handles failures gracefully.
- Not all "serverless" services are pay-per-use. OpenSearch (and its predecessor, Elasticsearch Service) charges by the hour for running instances. Know which services bill continuously vs. per-request before you architect around them.
- Lex V2 code hooks have two invocation modes with different response contracts. DialogCodeHook is for validation (return Delegate, ElicitSlot, or Close), and FulfillmentCodeHook is for final actions (return Close with the confirmation message). Get this wrong and Lex enters an infinite loop.
- Build the entire pipeline before creating expensive resources. Write and test each component in isolation, wire them together with cheap/free services first, and only spin up the expensive pieces (OpenSearch) for final integration testing.
Built for Cloud Computing at NYU, Spring 2026. The full source is at github.com/Sachin1801/dining-concierge-chatbot.