Verbs of Perception: Bare Infinitive vs. Gerund

When we use verbs of perception like see, hear, feel, notice, or watch, the action verb that follows can take two forms: the bare infinitive (without to) or the -ing form. We use the bare infinitive for a completed action ("I saw him drop the envelope") and the -ing form for an action in progress ("I heard something breathing heavily"). After an active verb of perception, the to-infinitive is never used in these structures.

In this challenge, you will practice choosing the correct verb forms to follow verbs of perception. You will help detectives finalize their surveillance reports, complete campers' spooky ghost stories, and finish a wildlife photographer's field notes. Along the way, you will learn when both the bare infinitive and the -ing form are grammatically acceptable depending on the context.

You'll work through 10 questions featuring a mix of single-choice, multi-choice, drag-and-drop, and drop-down formats.

Try the quiz to check your knowledge!

Question 1

Help the detective record the witness's exact statement about the clumsy thief.

"I watched the thief ___ the diamond and put it in his pocket," the witness testified.

The correct answer is take.

Verbs of perception (like watch, see, hear) are followed by an object and a bare infinitive (the base verb without "to") when you witness the entire completed action.

While the -ing form (taking) can sometimes be used for actions in progress, it doesn't work here because the second action is a bare infinitive ("and put it in his pocket"). The verbs must match!

Question 2

Help the wildlife photographer finish his field notes by dragging the correct verbs to complete the funny story.

I was hiding in the bushes when I noticed a large, fluffy bear trying to sneakily open a camper's picnic basket. A few minutes later, I observed the clumsy animal trip over a fallen log and roll all the way down the hill.

I was hiding in the bushes when I noticed a large, fluffy bear trying to sneakily open a camper's picnic basket.

When you catch someone (or a bear!) in the middle of doing something, use a verb of perception (like notice) followed by the -ing form. The bear was already in the process of trying to open the basket when the photographer noticed.

A few minutes later, I observed the clumsy animal trip over a fallen log and roll all the way down the hill.

For a quick action that you witness from beginning to end, use the bare infinitive. The bear tripped and rolled as one complete, unfortunate sequence of events!

Question 3

Complete the camper's spooky story about a midnight visitor.

I was frozen in my sleeping bag when I heard something ___ heavily right outside my tent.

The correct answer is breathing.

After verbs of perception (like hear, see, feel), we use the -ing form to describe an action that was already in progress when we noticed it.

We never use the "to-infinitive" (to breathe) or a conjugated verb (breathes, was breathing) directly after the object in this structure.

Question 4
Help a frustrated student complete a text message complaining about the noise next door. Select ALL the options that grammatically complete the sentence.
Through the thin walls, I heard my roommate's terrible band ___ the exact same song for three hours!

The correct answers are play and playing.

After verbs of perception (like hear, see, watch, feel), we can use an object followed by either a bare infinitive (play) or an -ing form/gerund (playing).

We never use the to-infinitive (to play) after an active verb of perception!

Note on meaning: Using "play" (bare infinitive) suggests you heard the entire completed action from start to finish. Using "playing" (-ing form) emphasizes the ongoing, continuous nature of the action. Both are completely correct here!

Question 5

Complete the frustrated roommate's text message about a terrible singer.

I can literally feel the floorboards ___ every time Kevin tries to hit a high note!

The correct answer is shake.

With verbs of physical perception like feel, we use the bare infinitive (the base verb without "to") to describe a complete, repeated action or fact.

A very common mistake is adding "to" before the verb, but remember: verbs of perception are allergic to the word "to"!

Question 6
Complete the ghost hunter's spooky blog post. Select ALL the words that correctly fill in the blank.
As I stood alone in the dark, abandoned hallway, I suddenly felt an icy hand ___ my shoulder!

The correct answers are touch and touching.

Feel is a verb of perception. It follows the pattern: feel + object + bare infinitive (touch) or feel + object + -ing form (touching).

"Touch" means you felt the whole quick action happen. "Touching" means you felt the action while it was happening. Both are grammatically correct and perfectly spooky!

Verbs of perception do not take the to-infinitive (to touch) or third-person singular verbs (touches).

Question 7

Help Detective Miller complete her observation report by dragging the correct verb forms into the blanks.

Detective Miller watched the suspect sipping his coffee nervously for several minutes. Suddenly, she saw him drop a small envelope onto the floor and quickly walk away.

Detective Miller watched the suspect sipping his coffee nervously for several minutes.

After verbs of perception like watch, see, or hear, we use the -ing form (present participle) to describe an action that is ongoing or in progress. The detective watched him over a period of time.

Suddenly, she saw him drop a small envelope onto the floor and quickly walk away.

We use the bare infinitive (the base verb without "to") after verbs of perception to describe an action perceived in full, from start to finish. Dropping the envelope was a quick, completed event.

Question 8
Help Detective Barnaby finalize his dramatic surveillance report. Select ALL the sentences that are grammatically correct.

The correct answers are "I watched the suspect sneak into the bakery and steal a cupcake." and "I watched the suspect sneaking into the bakery through the back door."

The verb watch is a verb of perception. The grammatical pattern is: watch + object + bare infinitive OR watch + object + -ing form.

  • "Sneak" (bare infinitive) focuses on the completed action.
  • "Sneaking" (-ing form) focuses on the action in progress.

You cannot use a to-infinitive ("to sneak") or a conjugated verb ("sneaks") after the object.

Question 9
Complete the camper's slightly embarrassing ghost story by selecting the correct option for each blank.
I was sitting by the dark campfire, convinced the woods were haunted, when I suddenly felt my phone ___ exactly once in my pocket. I pulled it out in a panic, only to see my best friend ___ me from the tent right behind me!

The correct answers are vibrate and calling.

Felt my phone vibrate: We use the bare infinitive (vibrate) after a verb of perception (feel) to describe a short, completed action. The clue here is "exactly once."

See my best friend calling: We use the -ing form (calling) after a verb of perception (see) to describe an action that is in progress at the moment we perceive it. The phone was actively ringing when the camper looked at it.

Question 10

Complete the ghost hunter's spooky diary entry by dragging the right verb forms into the gaps.

While staying at the old mansion, I distinctly heard someone walking up the wooden stairs, step by heavy step. A moment later, I felt a freezing cold breeze brush past my shoulder and slam the bedroom door.

While staying at the old mansion, I distinctly heard someone walking up the wooden stairs, step by heavy step.

We use the -ing form after verbs of perception (like hear) when we perceive an action in progress. The phrase "step by heavy step" highlights that the action was ongoing.

A moment later, I felt a freezing cold breeze brush past my shoulder and slam the bedroom door.

We use the bare infinitive (without "to") after verbs of perception (like feel) for a sudden, completed action. You can also tell "brush" is correct because it shares a structure with the bare infinitive "slam" later in the sentence!

Gerund

  • I enjoy reading. — ❌ I enjoy to read.
  • She's good at swimming. — ❌ She's good at to swim.
  • He avoids making eye contact. — gerund after avoid
  • Running is good exercise. — gerund as subject

A gerund is the -ing form of a verb functioning as a noun. It follows verbs like enjoy, avoid, finish, mind and ALL prepositions. Never use an infinitive where a gerund is required.

Rule: after a preposition (at, in, of, about, without) → always gerund. After enjoy, avoid, finish, mind, suggest, deny → always gerund.

Infinitive

  • I want to go. — to-infinitive after want
  • She can swim. — bare infinitive after modal
  • Let me help. — bare infinitive after let
  • I enjoy to read. — wrong (enjoy takes gerund, not infinitive)

The infinitive has two forms: to-infinitive (to go) after verbs like want, decide, plan, hope; bare infinitive (go) after modals and causatives (let, make, help).

Rule: after want, need, decide, plan, hope, expect, agree, refuse → to-infinitive. After can, will, must, let, make → bare infinitive. After enjoy, avoid, finish → gerund, NOT infinitive.

Object

  • Sam fed the dogs. — direct object (what was fed)
  • She sent him a present. — indirect object (who received it)
  • She waited for Lucy. — prepositional object (after preposition)
  • I gave her a book. — indirect + direct object together

An object is what a verb acts on or directs its action toward. Direct = the thing affected. Indirect = the recipient. Prepositional = after a preposition.

Test: Verb + what/whom? = direct object. Verb + to/for whom? = indirect object. After a preposition? = prepositional object.

Participle

  • a broken window — past participle as adjective
  • the running water — present participle as adjective
  • I have eaten. — past participle in perfect tense
  • She is sleeping. — present participle in progressive tense
  • I have went. — wrong (past tense, not past participle: use gone)

A participle is a verb form that also works as an adjective. Present (-ing): running, sleeping. Past (-ed or irregular): broken, written, gone. Used in progressive tenses, perfect tenses, passive voice, and as modifiers.

Trap: don't confuse past tense (went) with past participle (gone). After have/has/had → always past participle.

Sentence

  • She left. — simple (one independent clause)
  • She left, and he stayed. — compound (two independents)
  • She left because she was tired. — complex (independent + dependent)
  • She left because she was tired, and he stayed. — compound-complex

A sentence = one or more clauses forming a complete thought, ending with terminal punctuation. Four types based on clause structure: simple, compound, complex, compound-complex.

Minimum requirement: at least one independent clause with a subject + finite verb. Without that → fragment.

Verb

  • walk → walk / walks / walked / walked / walking (5 forms, regular)
  • go → go / goes / went / gone / going (5 forms, irregular)
  • be → am/is/are/was/were/be/being/been (8 forms)
  • can → can / could (modal: only 2 forms, no -s, no -ing)

A verb is the one word class every English sentence requires. Carries tense (when), aspect (duration), mood (attitude), and voice (active/passive). Regular verbs add -ed; ~200 irregular verbs have unpredictable past forms.

Key insight: fix your verbs and most grammar problems disappear. Wrong tense, wrong agreement, wrong form — verb errors account for the majority of grammatical mistakes.

B1 | Intermediate

  • If I had more time, I would travel more. — second conditional
  • The bridge was built in 1920. — passive voice
  • She said she was tired. — reported speech with backshift
  • Although it rained, we enjoyed the trip. — complex sentence with concession

These are B1 patterns — the CEFR intermediate level. At B1 you link ideas, use passive voice, handle reported speech, and manage second conditional — enough for travel, work basics, and everyday independence.

Marker: if you can explain why something happened and follow a news story, you're B1.

Medium

  • If I were you, I would apologise. — one rule (second conditional), but distractors like was tempt you
  • Answers require active thought, not instant pattern recognition
  • Vocabulary and context are realistic, not artificially simplified
  • Usually tests one rule, but the wrong answers are plausible

Medium marks middle-difficulty challenges: A2–B1, one rule tested, but with realistic distractors that require genuine understanding.

Use "Medium" when Easy feels too obvious but Hard feels overwhelming. This is where most productive learning happens — the sweet spot of difficulty.