In the past six months I’ve started to have finger, wrist, and forearm pain. I had previously been using the Logitech MX mouse and keyboard for work, but due to the pronation of my wrists I wasn’t able to type at all without severe pain that would continue through the evening. So recently I’ve been using Windows Voice Access to try to perform as much typing, and as many actions on the computer, as I can. This post is entirely written using my voice, and as you can see, I’m still having trouble making it a usable tool.

1. A good-quality microphone helps but isn’t enough. I’ve iterated through the microphones I have available: the built-in one on my laptop, a Logitech camera, the Bose NC700, and the Blue Yeti. Having a reasonable-quality microphone right next to your mouth will undoubtedly improve the voice recognition in Windows, but it will still often misinterpret words, and it’s necessary to speak more slowly and clearly than feels natural.

2. Actions interfere with typing and vice versa. I can’t type “open up”, because if I say it, it will open the Start menu. Any of the keyword action phrases will often confuse the computer into actually performing the action rather than typing the words out. This happens with keywords like “open”, “focus”, “left”, “press”, etc. It’s especially an issue when you’re trying to speak naturally and you pause mid-thought.

3. For some reason, Voice Access compatibility with certain apps is limited. If I have Firefox open, for example, and I say “focus on Firefox”, I get an error that says Firefox doesn’t exist. Sometimes when I’m typing emails, Voice Access will recognize what I say, but it will either come back with a “working on it” error for a long period of time, until I restart Voice Access, or it will tell me the text was inserted when it really wasn’t. To fix this, I sometimes have to double-press Enter. It happens sporadically, and I’m not sure how to create a reproducible example.

4. Sometimes dictation can take a while (as evidenced by the stray “Ford. 4 4.” this item originally started with) when words are misinterpreted and have to be repeated. Voice typing doesn’t work exactly like just speaking, because the computer will misinterpret pauses. The best results come from planning ahead and saying what you need to say as straightforwardly as possible, but of course, I don’t think many people speak like this. You can see above that many sentences were cut up by a period; those are just the moments where I wasn’t sure what to say next.

5. The text editing tools are limited, and this makes it difficult to actually edit and work. Updating one word in a sentence, for example, forces me to return my hands to the keyboard. Adding a comma somewhere in a sentence can be irritating. Here, for example, I’m not sure why it decided to stop capitalizing the first word after a period. The Voice Access toolbar, which appears at the top of your screen, offers two settings: one is “filter profanity” and the other is “turn on automatic punctuation”. When you need to type many symbols, it’s very difficult, especially in the context of coding.

6. I still feel self-conscious about speaking out loud. My office shares a wall with a hallway, and I’m sure people can hear me, especially when I have to repeat the same instruction over and over again.

7. It’s much better than what Windows previously had for voice typing. This is easily the best voice recognition tool I’ve used, and you can see that for many sentences it performs excellently. That said, I haven’t used many alternative tools, although I’ve been looking for some. I saw that Serenade AI, a voice recognition tool built specifically for coding, is available, but development has been slow recently, and it requires an Internet connection, which I don’t care for, especially for work; I’m not going to be able to use a tool that sends data back and forth.

8. When it works, voice typing feels infinitely faster than typing by hand. That said, I’m not often typing naturally like this. My daily workflow, which might involve working in Excel sheets, writing code, creating data visualizations, or working in common productivity software, is primarily mouse usage and short pieces of text that need to be replaced.

9. Gesture controls don’t seem to be widely available yet, and the tools that do exist are primarily repurposed ones. The Tobii eye tracker, for example, seems to be pivoting away from accessibility usage. The Tap-style wearables aren’t necessarily better; they just remove the keyboard, and I’m still going to have to move my wrists and fingers. I could restructure my mouse to make it more vertical, or use a trackpad, but I think I’d prefer to just be able to use a camera to recognize gestures (a rough sketch of that idea follows this list). This technology is apparently very difficult and might require multiple cameras; I haven’t seen many examples of it, but I’m trying to do more research. The point here is that different mice don’t necessarily help me. The question is: how could I possibly replace a mouse? Trackballs don’t generally seem to be recommended because they put a lot of stress on your thumbs, and trackpads that lie flat on the desk force my wrist to bend at odd angles. (I originally dictated “trackballs” and got “Dragon balls”, and my attempt to say “replace the word dragon with track” was typed out instead of executed. This is why I don’t use Voice Access to type out emails without reviewing and editing them.)

10. It feels like there needs to be a second step in Voice Access that reviews what you’ve said and then reorients it in the context of what was previously written: for example, combining sentences when it makes sense, or updating the punctuation. If I repeat a sentence with just one word different, the software might infer that I want to replace the earlier sentence with the one-word-different version rather than add it again afterwards (a toy sketch of this idea follows this list). There are commands like “delete that sentence”, but having to intersperse natural language with commands like this gets frustrating. In a conversation with somebody, I would just say the sentence over again. Microsoft does offer selection and correction tools for text, but they only seem to work in limited applications. For example, many don’t work while typing this post, and it takes a bit of practice to learn the commands, plus to remember them. Saying “correct previous word” here either jumps to the next paragraph and highlights the first word there, or tells me to select a number without actually showing any numbers anywhere on the page. Trying to use “select previous word” will also jump to the next paragraph.
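
To make the camera idea in point 9 a bit more concrete, here’s the kind of thing I imagine, sketched with off-the-shelf Python libraries. To be clear, this is my own rough assumption of how it could work, not something I’ve actually built a workflow around: track one hand from a single webcam with `mediapipe`, and map the index fingertip to the pointer with `pyautogui`.

```python
import cv2
import mediapipe as mp
import pyautogui

screen_w, screen_h = pyautogui.size()
cap = cv2.VideoCapture(0)  # default webcam

with mp.solutions.hands.Hands(max_num_hands=1,
                              min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            # Landmark 8 is the index fingertip, in normalized [0, 1] coordinates
            tip = results.multi_hand_landmarks[0].landmark[8]
            # Flip x so moving my hand to the right moves the pointer right
            pyautogui.moveTo((1 - tip.x) * screen_w, tip.y * screen_h)

cap.release()
```

Whether something this simple would be precise enough, or would just trade wrist strain for shoulder strain, is exactly the kind of thing I still need to research.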
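
And to illustrate the “second step” in point 10, here’s a toy sketch of the replace-instead-of-append behavior I’d want. This is purely my own illustration, not how Voice Access works: if a newly dictated sentence is nearly identical to the previous one, treat it as a correction.

```python
from difflib import SequenceMatcher

def append_or_replace(sentences, new_sentence, threshold=0.8):
    """Append new_sentence, unless it looks like a re-dictation of the last sentence."""
    if sentences:
        similarity = SequenceMatcher(None, sentences[-1], new_sentence).ratio()
        if similarity >= threshold:
            # Close enough to the previous sentence: treat it as a correction
            return sentences[:-1] + [new_sentence]
    return sentences + [new_sentence]

draft = ["The report is due on Tuesday."]
draft = append_or_replace(draft, "The report is due on Thursday.")
print(draft)  # ['The report is due on Thursday.'] -- replaced rather than appended
```

A real version would obviously need to handle more than just the previous sentence, but this is the kind of context-aware cleanup I’d like to see.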

As you can tell, it feels like the majority of the work is done, but ironing out these difficult situations, and creating better compatibility with the actual software you’re working in, seems to be the next major step. Working in programs like Excel or VS Code becomes very difficult. I just saw that GitHub Copilot is going to be releasing a voice copilot that’s currently in preview, and I signed up. I’ve heard the Dragon voice software is also a possibility, but many of these accessibility tools can become very expensive. Custom keyboards easily run into the hundreds of dollars, and specific gesture recognition cameras that are precise and work with the software you use can also be expensive and rare to find. And of course, it all needs to better recognize your accent.

As I test more software out, I’ll continue adding to this. To note: the title is my microphone picking up what somebody was saying outside in the hallway.