Be very descriptive, in whatever medium you use.
Detailed descriptions of the environment, NPCs and scene make the game come alive.
Engage multiple senses when you describe the scenario. The sights, sounds and smells the player characters perceive can contribute to their immersion. This works whether you narrate aloud or write everything in text.
You enter the tavern. There are three tables in the room, with two commoners playing a dice game at one. A lantern on the counter provides the room with dim light. A bartender stands behind the counter.
You walk into a small, wood-walled tavern smelling of cheap ale and stale bread. Three tables are spread haphazardly through the room. Two old men are gambling at one of them, intently watching their dice clatter loudly across the rough wooden surface. Dim light from a single dusty lantern bounces off the walls. A bartender behind the counter sees you enter and mutters quietly to himself.
Consider using images, maps, or (rarely) linked video.
Even in a primarily text-based format you can use images to increase immersion.
A little can go a long way. Even just an introductory graphic of a setting or place can make a big difference. Players can easily visualize their surroundings when they can actually see an example.
Images representing monsters or important NPCs make useful tools, even at an in-person tabletop game.
You may still be able to use audio effects remotely.
Google Hangouts is deprecated, so consider using Google Chat for text, or Google Meet for voice or video chat. Other options for audio or video chat include Discord, Zoom or the built-in chat systems in a virtual tabletop like Roll20.
You can include sound effects or music in most virtual tabletop apps. Alternatively, with a sound mixer you can mix music and sound effects in with your microphone audio.
Even without mixing in prerecorded effects, you can still use accents over audio chat.
The practice you've put into different voices and accents for your NPCs still comes across over voice or video chat. Video also lets you use gestures and facial expressions, but it requires more hardware and setup. Depending on your needs, it may be worth the investment.