This course is currently under maintenance. We're updating content and labs. Enrollment is paused; check back soon.
AI Instructor Live Labs Included

GCP: Google Cloud AI APIs

Use Google Cloud's pre-trained AI APIs — Vision, Natural Language, Speech-to-Text, and Translation — with simple Python REST calls. No ML knowledge required.

Beginner
9h 45m
6 Lessons

About This Course

Learn to use Google Cloud's suite of pre-trained AI APIs (Vision, Natural Language, Speech-to-Text, and Translation) through simple Python REST calls. No machine learning knowledge required. You'll make real API calls, build a wrapper for each service, and combine the wrappers into a multi-API content analysis pipeline.

Course Curriculum

01
AI Lesson

GCP: Introduction to Google Cloud AI APIs

1h 0m

Discover Google Cloud's suite of pre-trained AI APIs — Vision, Natural Language, Speech-to-Text, and Translation. Learn what each API does, when to use it versus a custom model, how REST calls are structured, and the authentication and quota model.
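As a rough illustration of how these REST calls are structured, the sketch below assembles the pieces of a POST request: an endpoint URL, an API key passed as the `key` query parameter, and a JSON body. The helper name `build_call` and the dictionary layout are assumptions made for this example, not part of the course material, and API-key authentication is only one of the options the lesson covers.

```python
import json

# Public REST endpoints for the four services (one method per service shown).
ENDPOINTS = {
    "vision": "https://vision.googleapis.com/v1/images:annotate",
    "language_sentiment": "https://language.googleapis.com/v1/documents:analyzeSentiment",
    "speech": "https://speech.googleapis.com/v1/speech:recognize",
    "translate": "https://translation.googleapis.com/language/translate/v2",
}


def build_call(service, body, api_key):
    """Return the parts of a POST request (hypothetical helper for illustration)."""
    return {
        "url": ENDPOINTS[service],
        "params": {"key": api_key},  # API key travels as a query parameter
        "headers": {"Content-Type": "application/json"},
        "data": json.dumps(body),    # every service takes a JSON request body
    }


# Example: a Translation request asking for English output.
call = build_call("translate", {"q": "Hola mundo", "target": "en"}, api_key="YOUR_API_KEY")
print(call["url"])
print(call["data"])
```

With the `requests` library, these pieces map onto `requests.post(call["url"], params=call["params"], headers=call["headers"], data=call["data"])`.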

02
Lab Exercise

GCP: Vision and Natural Language APIs in Python

2h 5m, 4 Exercises

Make real API calls to the Cloud Vision, Natural Language, and Translation APIs using Python requests. You'll detect labels and extract text from images, analyze sentiment and entities in text, detect and translate languages, and combine the APIs to build an image content analyzer.

  • Detect Labels and Extract Text with the Vision API (~20 min)
    Make your first Cloud Vision API calls using Python requests. You'll detect labels in an image, extract text with OCR, and learn how to parse the nested JSON response structure.
  • Analyze Sentiment and Extract Entities with the Natural Language API (~20 min)
    Call the Natural Language API to analyze the sentiment of product reviews and extract named entities from news text. Learn how to interpret score vs. magnitude and sort entities by salience.
  • Detect Language and Translate Text (~15 min)
    Use the Translation API to detect the language of text samples and translate them to English. Learn the difference between standalone detection and translation-with-detection, and handle already-English inputs efficiently.
  • Combine APIs: Analyze Image Content End to End (~25 min)
    Chain the Vision, Translation, and Natural Language APIs together: extract labels and OCR text from an image, detect the text language, translate if needed, then analyze sentiment. Build a function that returns a complete content analysis report.
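The exercises above can be sketched roughly as follows: build a Vision `images:annotate` body, parse the nested response, and interpret Natural Language results. The function names, the sentiment thresholds (0.25 and 1.0), and the sample response are illustrative assumptions; the sample is hand-written, not real API output.

```python
import base64


def vision_request_body(image_bytes, max_labels=5):
    """Build an images:annotate body requesting labels and OCR text."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [
                {"type": "LABEL_DETECTION", "maxResults": max_labels},
                {"type": "TEXT_DETECTION"},
            ],
        }]
    }


def parse_vision(response_json):
    """Pull labels and OCR text out of the nested response structure."""
    first = (response_json.get("responses") or [{}])[0]
    labels = [(a["description"], a["score"]) for a in first.get("labelAnnotations", [])]
    text_annotations = first.get("textAnnotations", [])
    # The first textAnnotations entry carries the full detected text.
    text = text_annotations[0]["description"] if text_annotations else ""
    return {"labels": labels, "text": text}


def describe_sentiment(score, magnitude, score_cut=0.25, mixed_cut=1.0):
    """Label a result; score is polarity, magnitude is overall emotional strength.

    The cut-off values are assumptions chosen for this example.
    """
    if score >= score_cut:
        return "positive"
    if score <= -score_cut:
        return "negative"
    # Near-zero score: strong magnitude suggests mixed feelings, weak suggests neutral.
    return "mixed" if magnitude >= mixed_cut else "neutral"


def top_entities(entities_json, n=3):
    """Return the n entities with the highest salience."""
    entities = entities_json.get("entities", [])
    return sorted(entities, key=lambda e: e.get("salience", 0.0), reverse=True)[:n]


# Hand-written sample shaped like a Vision response.
sample = {"responses": [{
    "labelAnnotations": [{"description": "Street sign", "score": 0.97}],
    "textAnnotations": [{"description": "STOP\nAll way"}, {"description": "STOP"}],
}]}
parsed = parse_vision(sample)
print(parsed["labels"], repr(parsed["text"]))
print(describe_sentiment(0.0, 4.2))
```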
03
AI Lesson

GCP: Speech-to-Text and Translation APIs

1h 0m

Deep dive into the Speech-to-Text and Translation APIs — how audio encoding works, transcription confidence scores, speaker diarization, language detection, and batch translation. Understand when to use each API tier and how to handle multi-language content.
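To make the audio encoding and confidence ideas concrete, here is a minimal sketch of a synchronous `speech:recognize` request body and a parser that keeps the top alternative of each result. The function names, the default values (LINEAR16, 16000 Hz, en-US), and the sample response are assumptions for illustration; the sample is hand-written, not real API output.

```python
import base64


def speech_request_body(audio_bytes, language_code="en-US",
                        sample_rate_hz=16000, encoding="LINEAR16"):
    """Build a speech:recognize body; the raw audio is sent base64-encoded."""
    return {
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def best_transcript(response_json, min_confidence=0.0):
    """Join the top alternative of each result, dropping low-confidence ones."""
    pieces = []
    for result in response_json.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        top = alternatives[0]  # alternatives are ordered most probable first
        if top.get("confidence", 0.0) >= min_confidence:
            pieces.append(top.get("transcript", "").strip())
    return " ".join(pieces)


# Hand-written sample shaped like a recognize response.
sample = {"results": [
    {"alternatives": [{"transcript": "hello world", "confidence": 0.94}]},
    {"alternatives": [{"transcript": " this part is noisy", "confidence": 0.41}]},
]}
print(best_transcript(sample))
print(best_transcript(sample, min_confidence=0.5))
```

Filtering on a confidence threshold is a design choice: it trades completeness of the transcript for fewer misrecognized segments.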

04
Lab Exercise

GCP: Building a Multi-API Content Analyzer

2h 30m, 4 Exercises

Build a complete content analysis pipeline that combines the Vision, Natural Language, Speech-to-Text, and Translation APIs. You'll implement individual API wrappers, chain them together, and create a unified analyzer that handles multilingual image and audio content.

  • Build the VisionAPIClient Wrapper (~25 min)
    Implement a clean VisionAPIClient class wrapping all Vision API features: label detection, OCR, object localization, and safe search. The class handles authentication and request building so callers only deal with parsed results.
  • Build the NaturalLanguageClient Wrapper (~25 min)
    Implement NaturalLanguageClient covering sentiment analysis (with per-sentence breakdown), entity extraction (with Wikipedia URLs), and content classification. All methods share a common _post() helper.
  • Implement Speech-to-Text and Translation Wrappers (~25 min)
    Implement transcribe_audio_url() using base64-encoded audio and the synchronous Speech-to-Text endpoint. Also build detect_language() and translate_to_english() with a short-circuit for already-English text.
  • Build the ContentAnalyzer Multi-API Pipeline (~30 min)
    Assemble the Vision, Translation, and Natural Language wrappers into ContentAnalyzer, a class that takes an image URL, runs Vision (labels, objects, safe search, OCR), Translation (language detect + translate), and Natural Language (sentiment + entities), and returns a structured report with a flagged field for inappropriate content.
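One possible shape for the final pipeline is sketched below. The wrappers are injected as plain callables so the flow can be exercised with stubs instead of live API calls. The constructor signature, the report keys, and the rule that LIKELY or VERY_LIKELY safe-search ratings set `flagged` are all assumptions for this example; the lab's actual class may differ.

```python
# Safe-search likelihood values treated as inappropriate (assumed threshold).
FLAG_LEVELS = {"LIKELY", "VERY_LIKELY"}


def is_flagged(safe_search):
    """True if any checked safe-search category is rated LIKELY or higher."""
    return any(safe_search.get(k) in FLAG_LEVELS for k in ("adult", "violence", "racy"))


class ContentAnalyzer:
    """Chains Vision, Translation, and Natural Language steps (illustrative sketch)."""

    def __init__(self, analyze_image, detect_language, translate_to_english, analyze_text):
        self.analyze_image = analyze_image
        self.detect_language = detect_language
        self.translate_to_english = translate_to_english
        self.analyze_text = analyze_text

    def analyze(self, image_url):
        vision = self.analyze_image(image_url)
        text = vision.get("text", "").strip()
        report = {
            "image_url": image_url,
            "labels": vision.get("labels", []),
            "flagged": is_flagged(vision.get("safe_search", {})),
            "ocr_text": text,
            "language": None,
            "english_text": None,
            "sentiment": None,
        }
        if not text:
            return report  # nothing to translate or analyze
        language = self.detect_language(text)
        # Short-circuit: skip the translation call for already-English text.
        english = text if language == "en" else self.translate_to_english(text)
        report.update(language=language, english_text=english,
                      sentiment=self.analyze_text(english))
        return report


# Usage with stub wrappers standing in for the real API clients.
translate_calls = []


def fake_translate(text):
    translate_calls.append(text)
    return "Big sale today"


analyzer = ContentAnalyzer(
    analyze_image=lambda url: {
        "labels": ["Poster"],
        "text": "Gran venta hoy",
        "safe_search": {"adult": "VERY_UNLIKELY", "violence": "UNLIKELY", "racy": "LIKELY"},
    },
    detect_language=lambda text: "es",
    translate_to_english=fake_translate,
    analyze_text=lambda text: {"score": 0.6, "magnitude": 0.6},
)
report = analyzer.analyze("https://example.com/poster.jpg")
print(report["flagged"], report["language"], report["english_text"])
```

Passing the wrappers in as callables keeps the pipeline logic testable without network access or credentials.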
05
AI Lesson

GCP: AI Image Generation — Nano Banana 2 and Pro

45m

Learn how Google's Nano Banana image generation models work — Nano Banana 2 (gemini-3.1-flash-image-preview) for fast, high-volume generation and Nano Banana Pro (gemini-3-pro-image-preview) for maximum quality with built-in thinking. Understand the image generation API, response_modalities, ImageConfig parameters, and when to choose each model.

06
Lab Exercise

GCP: Hands-On Image Generation with Nano Banana

2h 25m

Use the Gemini image generation API in VS Code to generate images from text prompts, edit reference images, build a multi-turn image editing session, and compare the output quality of Nano Banana 2 versus Nano Banana Pro. All exercises use the new google-genai SDK.
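A text-to-image call with the google-genai SDK might look roughly like the sketch below. The model IDs are the ones named in this course description; the helper names, the `tier` keys, and the assumption that an API key is available in the environment are illustrative. The usage at the bottom runs against stand-in objects, not a real response.

```python
from types import SimpleNamespace

# Model IDs as listed in the course description.
MODELS = {
    "fast": "gemini-3.1-flash-image-preview",  # Nano Banana 2
    "quality": "gemini-3-pro-image-preview",   # Nano Banana Pro
}


def extract_images(parts):
    """Return raw image bytes from response parts, skipping text-only parts."""
    images = []
    for part in parts:
        inline = getattr(part, "inline_data", None)
        if inline is not None and inline.data:
            images.append(inline.data)
    return images


def generate_image(prompt, tier="fast"):
    """Request an image for a text prompt (hypothetical helper, needs the SDK)."""
    # Imported lazily so extract_images() works without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()  # assumes the API key is set in the environment
    response = client.models.generate_content(
        model=MODELS[tier],
        contents=prompt,
        config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
    )
    return extract_images(response.candidates[0].content.parts)


# Stand-in parts shaped like a mixed text-and-image response.
fake_parts = [
    SimpleNamespace(text="Here is your image.", inline_data=None),
    SimpleNamespace(text=None, inline_data=SimpleNamespace(data=b"\x89PNG")),
]
print(len(extract_images(fake_parts)))
```

Calling `generate_image("a watercolor lighthouse", tier="quality")` would return a list of image byte strings that can be written to disk with `open(path, "wb")`.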


This course includes:

  • 24/7 AI Instructor Support
  • Live Lab Environments
  • 3 Hands-on Lessons
Skill Level Beginner
Total Duration 9h 45m