Captions and Transcripts

The University of Toronto is committed to the principles of the Accessibility for Ontarians with Disabilities Act (AODA). According to Ontario Regulation 191/11, section 14, all internet websites and web content must conform with WCAG 2.0 Level AA by January 1, 2021, other than success criteria 1.2.4 Captions (Live) and 1.2.5 Audio Descriptions (Pre-recorded).

Universal Design for Learning (UDL)

Audio descriptions, captions, subtitles, and transcripts align with the UDL guideline of providing multiple means of representation. Captions and transcripts benefit all users, including non-native speakers and viewers watching videos on low bandwidth or in noisy or quiet environments. Students learning new terminology can also use captions and transcripts to improve comprehension, information processing, and retention.

Descriptions, Captions, Subtitles, and Transcripts

What are the differences between descriptions, captions, subtitles, and transcripts?

Audio Descriptions

  • Narrations that describe visual information needed to understand the content
  • Narrations inform those who cannot see the video

Captions

  • Text versions of speech and other important audio content synchronized to the visual and auditory content
  • The most common type is “Closed Captions,” which can be turned on or off via the “CC” button on video players
  • “Open Captions” are part of the video stream and cannot be turned off
  • “Live captioning” (or “live closed captioning”) refers to the simultaneous creation and display of captions for live events and videos

Subtitles

  • Text translations of speech and audio content

Transcripts

  • Text versions of speech and descriptions of important audio and visual information
  • Transcripts do not present any information about the timing
  • Transcripts offer the option of reading text if the web audio or video content is inaccessible

For more information, refer to the World Wide Web Consortium (W3C)’s pages on Captions/Subtitles, Transcripts, and Descriptions.

Automatic Live-Captioning Tools

Real-time automatic subtitles depend on a cloud-based service, which requires a fast and reliable internet connection. Work in a space with minimal background noise and check that the microphone is plugged in and unused by another application. Speak clearly and steadily to increase accuracy.

By contrast, real-time captioning requires a skilled transcriber. No teaching assistant or student should bear the responsibility of providing accurate captioning in real time.

Microsoft 365 PowerPoint can generate live captions and subtitles (for a single presenter) in several languages. Captions and subtitles are not saved.

  1. Log in to your online Outlook/UTmail+ account and click on the waffle in the top left corner. Click “All apps.” Select PowerPoint.
  2. Navigate to the “Slide Show” ribbon, select the “Always Use Subtitles” option, and choose where you would like the subtitles to appear.
  3. Select the “Spoken Language” and the “Subtitle Language.”
  4. Begin your presentation.
  5. If the real-time automatic subtitle feature does not turn on while you are presenting, click on the “Toggle Subtitles” button (which looks like a rectangle with dashes near the bottom) on the toolbar below the slide.

Microsoft Teams can provide live captions (available only in U.S. English) during meetings.

  1. Go to “Meeting Controls” and click on the three-dot (or ellipsis) menu for “More actions.”
  2. Select “Turn on live captions.”
  3. The new meeting experience attributes speakers to captions.

Zoom* can provide live captions (available only in U.S. English) during meetings. Zoom can also automatically transcribe meetings recorded to the cloud. The transcript appears as a separate .vtt file.

  1. Sign in to the Zoom web portal, and in the navigation panel, go to “Settings.”
  2. Scroll down to “Closed captioning” and turn on closed captioning (the slider will move to the right and become blue).
  3. Check the box for “Enable live-transcription service to show transcript on the side panel in-meeting.”
  4. As the Zoom meeting host, select the “Live Transcript” button from the Zoom control bar.
  5. When the Live Transcription menu opens, click “Enable Auto-Transcription.”
  6. Remind participants to select “Show Subtitles” from the “Live Transcript” menu.

* There is no central office at the University of Toronto that provides tech support for Zoom, which is only provisionally supported. Instructors and teaching assistants using Zoom should confirm support with their department and division.

Automatic Captioning Tools

The most common caption file formats are .srt and .vtt. The two formats differ in several ways, and the choice depends on the video hosting platform. For example, MyMedia uses the .vtt format. The tools listed below generate caption files automatically.

Generating your own .vtt file from scratch can be tricky. Visit W3C’s page on WebVTT: The Web Video Text Tracks Format for guidance.
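For a sense of what a .vtt file contains, here is a minimal Python sketch that builds one from a list of timed captions. It assumes only the basic WebVTT layout (a WEBVTT header line, then cues made of a timing line and caption text, separated by blank lines). The function names and sample captions are hypothetical, and the W3C page remains the authority for anything beyond this.

```python
def format_timestamp(seconds):
    """Format a number of seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(cues):
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]  # the file begins with the WEBVTT header line
    for start, end, text in cues:
        if end <= start:
            raise ValueError(f"cue must end after it starts: {text!r}")
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates one cue from the next
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical captions, for illustration only.
    sample = [
        (0.0, 2.5, "Welcome to the course."),
        (2.5, 6.0, "Today we will review the syllabus."),
    ]
    print(build_vtt(sample))
```

Saving the returned text to a file with a .vtt extension (UTF-8 encoded) should give a caption file that can be checked in a video player before uploading.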

Microsoft Stream, a part of Office 365, is a secure video service for uploading, viewing, and sharing videos.

  1. Log in to your online Outlook/UTmail+ account and click on the waffle in the top left corner. Click “All apps.” Select the Stream tool.
  2. Click on “+ Create” to upload video. Stream supports many file formats including .mp4, .avi, .flv, .mkv, .mov, .wav, .wmv.
  3. Set the “Video Language” to automatically generate a caption file.
  4. Set “Permissions.”
  5. In “Options,” check the box for “Autogenerate captions.”
  6. To check and edit the captions, click on “My content” and click on the video to watch.
  7. The transcript box will appear to the right of the video. Click on the pen icon to edit the transcript. Tip: insert punctuation to facilitate clarity and ease of reading.
  8. To download the captions, click on the three-dot (or ellipsis) menu and select “Update video details.”
  9. Under the “Options” column, located on the far right, click on “Download file” next to the word “Captions.” The file will save in .vtt format.

Videos in Microsoft Stream are searchable by the University of Toronto community, whereas MyMedia allows sharing with privacy. Choose where to host your videos based on who should be able to view them.

YouTube* Studio allows you to manage your YouTube videos.

  1. Log in to your YouTube Studio using your Gmail account.
  2. Upload the video, set the language, and set the video’s visibility.
  3. Once the video has been uploaded, click on the pen icon to review the video’s details. From the left menu, select “Subtitles.” YouTube automatic captioning can take a while. Budget time for this task.
  4. To check and edit the captions, click on the three-dot (or ellipsis) menu next to the word “Published” and select “Edit on Classic Studio.” Once completed, click on the “Return to YouTube Studio” button in the top right corner.
  5. To download the captions, find the video in YouTube Studio, and click on the pen icon to review the video’s details.
  6. From the left menu, select “Subtitles.” Click on the three-dot (or ellipsis) menu next to the word “Published” and select “Download.” You can download the captions in .vtt, .srt, or .sbv formats. Sometimes YouTube will not automatically generate captions due to audio complexity and video length.

* YouTube is blocked in various countries. Consider hosting videos via MyMedia to provide a secure and accessible space for all University of Toronto learners.
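If captions were downloaded in .srt format but the hosting platform expects .vtt (as MyMedia does), the conversion is largely mechanical. The Python sketch below assumes a well-formed .srt file and handles only the two basic differences: WebVTT needs a WEBVTT header line, and its timestamps use a period rather than a comma before the milliseconds. The function name is hypothetical, and styling or positioning tags are not handled.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,200".
TIMING = re.compile(r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text):
    """Convert SubRip (.srt) caption text to WebVTT (.vtt) text."""
    out = ["WEBVTT", ""]
    # Drop a leading byte-order mark, if any, before splitting into lines.
    for line in srt_text.lstrip("\ufeff").splitlines():
        # Only timing lines change; cue numbers and caption text pass through.
        out.append(TIMING.sub(r"\1.\2 --> \3.\4", line.strip()))
    return "\n".join(out).rstrip("\n") + "\n"


if __name__ == "__main__":
    # Hypothetical captions, for illustration only.
    srt_sample = (
        "1\n00:00:00,000 --> 00:00:02,500\nHello, everyone.\n\n"
        "2\n00:00:02,500 --> 00:00:05,000\nWelcome.\n"
    )
    print(srt_to_vtt(srt_sample))
```

The cue numbers from the .srt file are left in place, since WebVTT permits an identifier line before each timing line.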

MyMedia is an archival storage and streaming solution for University academic media content. MyMedia does not generate auto-captions but allows for uploading captions and sharing with privacy.

  1. Access MyMedia. You will be asked to log in with your UTORid.
  2. Click on “New Upload” to upload a video. MyMedia supports most libavcodec video and audio formats.
  3. To add captions to your video, click on the pen icon and select the “Tracks” tab.
  4. Click on “Upload New Track,” then “Choose File,” and select type and language for the .vtt file generated from Microsoft Stream. Tip: select “Captions” as MyMedia can use the same .vtt file for closed captioning and transcripts.
  5. Click on the video to check if a “CC” button is available in the bottom right corner.
  6. Click on “Transcript” to view the text next to the video.
  7. To edit the captions, return to your dashboard by clicking on “MyMedia” in the upper left corner. Select the pen icon and select the “Tracks” tab. Download the .vtt file and re-upload when editing is complete.
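A downloaded .vtt file can also be turned into a plain transcript by dropping the timing information. The following Python sketch is one way to do this, assuming a simple caption file whose cues are separated by blank lines; the function name is hypothetical, and any inline markup in the caption text is left untouched.

```python
import re


def vtt_to_transcript(vtt_text):
    """Return the caption text of a WebVTT file with timing lines removed."""
    text = vtt_text.replace("\r\n", "\n").strip()
    spoken = []
    for block in re.split(r"\n{2,}", text):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        if not lines or lines[0].startswith(("WEBVTT", "NOTE", "STYLE", "REGION")):
            continue  # skip the header and non-caption blocks
        for i, ln in enumerate(lines):
            if "-->" in ln:
                # Everything after the timing line is caption text;
                # anything before it is an optional cue identifier.
                spoken.extend(lines[i + 1:])
                break
    return " ".join(spoken)


if __name__ == "__main__":
    # Hypothetical captions, for illustration only.
    vtt_sample = (
        "WEBVTT\n\n"
        "1\n00:00:00.000 --> 00:00:02.500\nHello, everyone.\n\n"
        "00:00:02.500 --> 00:00:05.000\nWelcome to\nthe course.\n"
    )
    print(vtt_to_transcript(vtt_sample))
```

The result is a single run of text, so paragraph breaks and descriptions of important visual information would still need to be added by hand.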

Automatic Transcription Tools

Microsoft 365 Word can generate English transcriptions by recording audio directly in Word or uploading an audio file.

  1. Log in to your online Outlook/UTmail+ account and click on the waffle in the top left corner. Click “All apps.” Select Word.
  2. To generate a transcription by recording audio directly into Word, go to “Home,” select “Dictate,” and “Transcribe.”
  3. Click “Start recording.” Give the browser permission to use your mic.
  4. Wait for the pause icon to be outlined in blue and start talking.
  5. When finished, select “Save and transcribe now.” The transcription may take a while.

Otter is an application that generates speech-to-text transcriptions using artificial intelligence and machine learning. Otter is useful for generating transcripts for audio clips or for videos that have no captions. Otter is accessible on the web and via a mobile app. The free plan allows 600 minutes of transcription per month, with a maximum of 40 minutes per sitting.

  1. Create an Otter account and click the blue microphone button to record.
  2. As you speak, Otter picks up the audio. When you are done, click on the stop button.
  3. Otter will process the recording. Once completed, click on the note to view the transcription.
  4. To edit the transcript, click on the pen icon.
  5. To save the transcript, click on the three dots (or ellipsis) menu and select “Export,” then “Export text.”
  6. The transcript can be exported as plain text or a .txt file under the free plan.
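The free-plan limits mentioned above (600 minutes per month, 40 minutes per sitting) make it worth estimating how a longer recording would need to be split. The short Python sketch below does that arithmetic; the function name is hypothetical, and the limits are taken from this page and may change.

```python
import math

MONTHLY_MINUTES = 600  # monthly allowance on the free plan, as stated on this page
MAX_PER_SITTING = 40   # maximum minutes per sitting on the free plan, as stated on this page


def plan_sittings(recording_minutes):
    """Return (sittings needed for one recording, whole recordings that fit in a month)."""
    if recording_minutes <= 0:
        raise ValueError("recording length must be positive")
    sittings = math.ceil(recording_minutes / MAX_PER_SITTING)
    per_month = MONTHLY_MINUTES // recording_minutes
    return sittings, per_month


if __name__ == "__main__":
    sittings, per_month = plan_sittings(50)
    print(f"A 50-minute recording needs {sittings} sittings; "
          f"{per_month} such recordings fit in a month.")
```

Under these assumptions, a 50-minute lecture would take two sittings, and twelve such lectures would use the full monthly allowance.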

Further Resources

CAST. (2021). The UDL Guidelines.

Karapita, M., ed. (2017). Inclusive Language in Media: A Canadian Style Guide.

W3C Web Accessibility Initiative (WAI). (2021). Making Audio and Video Media Accessible.

WebAIM. (2020). Captions, Transcripts, and Audio Descriptions.
