10 things you need to know to make a perfect screencast

Are there 10 things? No, I am just kidding!

Are you a frustrated instructional designer looking for the perfect tool for your tutorial video? Have you ever had an existential crisis about what it really means to make a screencast video? You have come to the right place.

In this article, I will try my best to discuss what to use, how to use it, and why. I will cover the underlying philosophy of what a screencast can do in the context of technology training, give a brief overview of the tools currently on the market, and finally share my own workflow for making a screencast, if that’s of any value to you.

Where does the idea come from?

Screen capture has been a handy tool since the advent of the computer.

A bit of clarification first, and some history.

When we talk about screen capture we may refer to the act of:

  1. Producing a “hardcopy” of the text-based terminal in front of you by pressing the “PrtSc” button on your keyboard (a key that persists to this day!)
  2. Capturing a static image of your whole desktop.
  3. Making a video file of what you see on your screen, including your mouse movement, and sometimes a narration.

It is obviously the third kind that we are primarily concerned with here.

I have yet to see a computer historian do the dirty work, but to my knowledge, the first tool that could capture your screen in a video format came in the 1990s. It was a product called ScreenCam, by the now-defunct Lotus company. A May 9, 1994 InfoWorld article describes it as:

Launching ScreenCam displays a small, VCR-like control panel, with Record and Stop buttons. When you are recording, everything you do in Windows is captured into an animated movie file that precisely record your actions.

Sounds familiar? Yes, it is very much the same thing we do these days. We are still stuck with the VCR metaphor.

One important feature of the ScreenCam software is that it records Windows events instead of every single frame displayed on the screen. This is important, because it calls for a fourth definition of the term screen capture.

4. Making a series of static images (screen shots), complemented by human interactions (mouse movements and keyboard clicks) between them.

In the case of ScreenCam, this notion of screencast served to reduce the computational power needed. Today, when that is no longer a concern, it serves a different purpose: breaking a chain of actions down into smaller units that can be precisely defined, and that therefore lend themselves to a mode of technology training called simulation. More on this later.
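To make the fourth definition more concrete, here is a minimal Python sketch of a recording stored as screenshots plus the input events between them. It is not ScreenCam's actual format; the names `InputEvent`, `Step`, and `split_into_steps`, and the rule that every click starts a new step, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InputEvent:
    time: float   # seconds since the recording started
    kind: str     # "move", "click" or "key"
    x: int = 0    # pointer position, if relevant
    y: int = 0
    key: str = "" # key pressed, if relevant


@dataclass
class Step:
    screenshot: str  # hypothetical image file name
    events: List[InputEvent] = field(default_factory=list)


def split_into_steps(screenshots, events):
    """Group events into steps, one screenshot per step.

    Assumption: a click usually changes the screen, so each click
    closes the current step and the next screenshot opens a new one.
    Expects at least one screenshot.
    """
    steps = [Step(screenshots[0])]
    shot_index = 0
    for event in sorted(events, key=lambda e: e.time):
        steps[-1].events.append(event)
        if event.kind == "click" and shot_index + 1 < len(screenshots):
            shot_index += 1
            steps.append(Step(screenshots[shot_index]))
    return steps
```

Each `Step` is a small, precisely defined unit: one static image and the actions performed on it. That is the shape a simulation needs, because the learner can be asked to repeat exactly those actions before moving on to the next image.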

A recording of your screen is not a screencast

Right now let us stay with the third definition: you produce a video file of what happens on your screen.

However, simply recording your screen is not enough to bring your audience’s attention to where it should be in the elearning context. So the tools have been refined over time, and a suite of what I call “special effects” is increasingly regarded as essential on top of the simple and straightforward screencast. These effects include, but are not limited to:

  • zooming in and out to highlight a certain area of the screen.
  • panning movement to highlight how the mouse is moved or how one area is connected to another.
  • magnifying the mouse pointer (or changing it into some other shape) and its trace to make it more discernible.
  • using markers (text and icons) to explain a specific thing on screen.
  • using exaggerated clicking sounds to imitate mouse clicks or keyboard input.
  • using animations (flashes) to imitate mouse clicking.
  • sectioning by lower thirds, such as captions for steps, chapters, etc.
  • branding with logo or watermark.
  • slowing down (as far as I can tell, this is done by manipulating the timeline).

If we are dealing with a touch screen, you also need to visually indicate gestures in a way that is self-explanatory: here is a tap, here is a swipe, here is a pinch, etc.

More importantly, the role that a screencast video plays needs to be contextualized. Many people tend to think of a screencast video as a standalone resource, as if the only thing we need to do is make the video and throw it out there. Technically that is true, and it certainly reflects the humble origin of the tool, as well as the whole experience of convenience and complete passivity popularized by YouTube.

But as recent research has found, video is not the solution to everything. People still prefer to read when it is a more efficient way of gathering information. Seriously, if you have been watching tutorial videos, count the time you wasted listening to all sorts of nonsense!

Even if video is superior to words in some cases, what is better still is video that invites active participation. This could be the future of YouTube. And Facebook is actively seeking to incorporate video as a community-based, collaborative, and highly interactive feature.

Workflow Dilemmas

To record a screencast, the first thing we need to think about is whether we need a camera feed. In other words, do you want a talking head, sometimes taking up the whole screen and sometimes appearing as picture-in-picture (PIP)? The personal touch is useful under certain circumstances but is generally not needed in instructional design, because the industry convention is that the instructional designer (ID) is largely an anonymous figure who works behind the scenes.

The second concern is the voiceover. Many people record their screen and just throw some text on it, or use a background music track. This shows how intimidating the task of creating a voiceover can be. It requires careful planning, skillful execution, reasonably competent hardware (which we have to purchase), and often painful post-production.

The final piece of the puzzle is the subtitle. If you are a YouTuber who just throws videos out there, this is not your concern at all. But many organizations require ADA-compliant subtitles for the videos they produce.

Looking at the three pieces (video, voice, subtitle) together, we ask: in what order do we produce these? The answer is often: it depends.

If a screencast is produced in a professional scenario, you often need a script written and approved in advance. You could use this script to make the video first, and then add narration. The order could well be: subtitle, video, voice.

I have also heard that some prefer to record the narration first, and then play it back while recording the video. This amounts to “do what is said”. It sounds easy, but you do need to manipulate the video quite a bit if the system cannot keep up with your pace.

On the other hand, one could argue that this is precisely the reason to do narration first. If you are recording while looking at the video, how do you manage to look at the script?

Either way, the fact that we have a script creates a dilemma we cannot completely eliminate: reading from a written script sounds unnatural, but improvising, whether while recording the video or while watching it afterwards, is not an acceptable option either.

Choices of Tools

Even in the context of elearning, there are many tools that can handle screencasts. The problem is that these tools overlap in terms of features, and it is not always easy to figure out a workflow that fits your scenario. For this purpose I have divided the tools into three categories:

Prototyping tools: rapid and hassle-free recording.

  • Peek: available both on Windows and Mac. It records and uploads immediately so you can share it with somebody else. No editing.
  • SnagIt: a tool by TechSmith that is far superior to anything in its range. It captures images and video. It can record panoramic or scrolling images. It can do basic video trimming. It can produce GIFs. It can do simple markup. It even has an asset management system, which is really useful if you are making tons of captures. This is an essential tool.

Specialized tools: these tools specialize in their tasks and strike a balance between features and ease of use.

  • Replay (a standalone product in Articulate 360): Windows only; a recording tool that can take three tracks (screen, webcam, lower thirds) and conveniently switch between them, or use PIP. Basic timeline editing is provided.
  • Camtasia: one of the most well-known tools of the trade. It is positioned as a video editing tool, but it is very different from general-purpose video editors such as Premiere, Final Cut or iMovie. It really addresses the needs of screencast production.
  • ScreenFlow: a native Mac app that is similar to Camtasia. But after I tried it, I found that even the Mac version of Camtasia (which everyone knows is a little brother to the Windows version) is far superior.

Heavy-weight tools: these are not screencast software per se but have the functionality built in. The philosophy behind these tools is that working with in-slide interactions already involves sophisticated manipulation of the timeline, so zooms, pans, and markers are just more elements added to that timeline. Making a screencast should therefore involve minimal additional learning for designers.

  • Storyline (Windows only): the screen recording feature inside Storyline gives you some extra features compared to Replay, which are essential in some cases:
    * Record system sounds.
    * Record the video as step-by-step slides and, later, as simulations.
    * Move new windows: this is designed to deal with interactions across multiple windows.
    * Zoom and pan.
    * Add markers to a particular point on the screen and timeline.
    * Support subtitles.
  • Adobe Captivate: achieves a similar set of functionalities.

It is worth mentioning that in these two elearning authoring tools, screen recording is basically understood as adding something to the slide timeline. There already is a timeline! Therefore you don’t need to learn anything new.

This philosophy makes perfect sense, assuming that we are talking about people who are already deeply entrenched in building slide interactions. But a large percentage of IDs are not. In fact, slide-based elearning is no longer something we can take for granted. This may come as a shock to some, who instantly equate elearning with slides, just as people think of PowerPoint when they think of a presentation.

But for the topic at hand, the biggest difference is that, in the context of a slide, a video is just one element on the timeline. By adding other elements to the timeline we can work “within” the video, showing things or changing the video itself according to timecode. In the webpage-based format, however, the video remains an indivisible unit embedded in a web page. While we can jump to any point in the video, everything else on the web page is unaffected by where we are in it, because those elements exist essentially in a timeless fashion.
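The slide-timeline idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how Storyline or Captivate store their data; `TimelineElement` and `visible_at` are hypothetical names, and the start and end times are made up.

```python
from dataclasses import dataclass


@dataclass
class TimelineElement:
    name: str     # e.g. the video itself, a marker, a zoom region
    start: float  # timecode in seconds at which the element appears
    end: float    # timecode in seconds at which it disappears


def visible_at(timeline, t):
    """Return the names of all elements active at timecode t."""
    return [e.name for e in timeline if e.start <= t < e.end]


# On a slide, the video is just one element among others.
slide = [
    TimelineElement("screen recording", 0.0, 60.0),
    TimelineElement("marker: Save button", 5.0, 8.0),
    TimelineElement("zoom: toolbar", 7.0, 12.0),
]
```

Calling `visible_at(slide, 7.5)` returns all three names, while `visible_at(slide, 20.0)` returns only the recording. A video embedded in a web page has no equivalent list: the page around it does not know what timecode the video is at unless someone writes extra code to tell it.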

To be continued…
