tait.tech

1. Accessible BIOS

Update: See my blog post with the guy who’s writing the new audio driver into EDK2.

Some server motherboards include serial UART/I2C ports which can be used to manage a BIOS via serial. If this is possible, would it be able to attach to a braille display via an intermediary like a Rockchip/Pi SBC or Arduino-compatible chip using BRLTTY and serial input from the motherboard? Maybe not, as it appears to require a full Unicode terminal, and I suspect that BRLTTY will not be able to automatically filter out the formatting characters.

I found one paper, from Brazil, referencing the (in)accessibility of BIOS, specifically UEFI BIOS. I have downloaded the paper and uploaded it here for reference. PDF of “UEFI BIOS Accessibility for the Visually Impaired”.

After emailing the authors of the paper, I found out that one of them, Rafael Machado, was able to get a song playing in UEFI as a part of his master’s. Here is a link to the GitHub Msc UEFI PreOS Accessibility; he has links to YouTube videos where he is shown playing a song on an ASUS laptop with PCIe-connected speakers: Song Playing in UEFI

I have downloaded and played around with his GitHub project, but to no avail. Either I am not setting it up correctly, or I do not have the proper sound setup, but in any case no sound plays from either my laptop or desktop.

This requires more research and investment to understand UEFI, HDA audio, what systems have it and how to work with words and other sounds.

2. Terminal-oriented browser

Use Selenium to allow a cross-engine compatible terminal-browser with JS support. Yes, sure, it has all the bloat of the modern web as it uses the full code of Chrome/Firefox/WebKit, but at least it can be used in the terminal. Guaranteed to be accessible.

I’m thinking of similar key commands to Orca/NVDA, but output is sent to the terminal. Unsure of how to handle aria-live regions, but perhaps a queue could be used to print text. Unsure how to calculate delay, as the user may be using a screen reader at different speeds and/or a braille display.

Change backend on-the-fly with a page reload. So if a website doesn’t work with WebKit, load it in Firefox with a key command.

Just an idea.

3. Dead Simple Chess App

I want to make a simple chess app which can connect to multiple backends (i.e. Lichess, Chess.com, etc.) so that users can use one app to play many games. This should be quite simple given how easy the Lichess API is, and the Chess.com API is coming soon!

This is read-only data. You cannot send game-moves or other commands to Chess.com from this system. If you wish to send commands, you will be interested in the Interactive API releasing later this year.

4. Open-Source VPN Deployment System

Help my business and others start their own.

I love this idea, but unfortunately, Canada has data retention laws that would stop me from protecting the privacy of anyone using a system delivered by me. Unless I incorporate in Switzerland or the Seychelles, this is not a viable option. Doing the above costs a fair amount in up-front investment that I am not willing to make at this point in time.

5. 3D printing of Google Maps/OpenStreetMap data for the visually impaired.

A larger project, to be sure, but one I think could be of interest. Imagine being able to download some data from Google or OpenStreetMap, then put it into a program and have it generate a 3D map which can be printed. I am unsure how to handle labels, as the braille overlay on top of the streets, important buildings, etc. needs to be of a uniform size (braille cannot be scaled), while the buildings, streets, and parks do need to be scaled in size.

I think for a beginning, forget the braille entirely and simply produce a map. This can be done in the STL file format or some intermediary if that is easier. Roads will have a slight border on the side, parks will have a diamond texture, buildings will have slight rectangular borders (slightly wider than the roads), paths will be a thin line, and the label for the path will need to extend the thin line into a (rounded) rectangle with text on it.

Start with roads. Get a road, get it to generate the correct shape. Then add a border around the side. Then, add 4 more roads and figure out how to intersect them.
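
As a starting point for the "get a road, get it to generate the correct shape" step, here is a minimal sketch, assuming a road is just a straight centre line with a width, and that a flat raised block is an acceptable first shape. The function names (road_outline, extrude, to_ascii_stl) are made up for illustration; borders and intersections are not handled.

```python
import math


def road_outline(p0, p1, width):
    """Four corners (counter-clockwise) of a straight road segment."""
    (x0, y0), (x1, y1) = p0, p1
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("road segment has zero length")
    # Unit vector perpendicular to the road direction (left-hand side).
    nx, ny = -(y1 - y0) / length, (x1 - x0) / length
    h = width / 2
    return [
        (x0 - nx * h, y0 - ny * h),
        (x1 - nx * h, y1 - ny * h),
        (x1 + nx * h, y1 + ny * h),
        (x0 + nx * h, y0 + ny * h),
    ]


def extrude(outline, height):
    """Triangles (three xyz points each) for a closed prism over a convex outline."""
    bottom = [(x, y, 0.0) for x, y in outline]
    top = [(x, y, height) for x, y in outline]
    n = len(outline)
    tris = []
    # Fan triangulation is only valid because the outline is convex.
    for i in range(1, n - 1):
        tris.append((top[0], top[i], top[i + 1]))            # faces up
        tris.append((bottom[0], bottom[i + 1], bottom[i]))   # faces down
    # Two triangles per side wall, wound so the normals point outwards.
    for i in range(n):
        j = (i + 1) % n
        tris.append((bottom[i], bottom[j], top[j]))
        tris.append((bottom[i], top[j], top[i]))
    return tris


def to_ascii_stl(tris, name="road"):
    """Serialise triangles as an ASCII STL string."""
    lines = [f"solid {name}"]
    for a, b, c in tris:
        u = [b[k] - a[k] for k in range(3)]
        v = [c[k] - a[k] for k in range(3)]
        # Facet normal from the cross product of two edges.
        n = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
        norm = math.sqrt(sum(x * x for x in n)) or 1.0
        lines.append("  facet normal {:.6f} {:.6f} {:.6f}".format(*(x / norm for x in n)))
        lines.append("    outer loop")
        for p in (a, b, c):
            lines.append("      vertex {:.6f} {:.6f} {:.6f}".format(*p))
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


# A 10-unit road, 2 units wide, raised 0.5 units off the base.
stl_text = to_ascii_stl(extrude(road_outline((0, 0), (10, 0), 2), 0.5))
```

Writing stl_text to a file gives something a slicer should be able to open. Intersecting several roads would need a real polygon union, which this sketch does not attempt.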

If it can be done on a display, it can be done in a file.

Start with that. Wow what a daunting project!

This is being worked on through the touch-mapper project. They do not, however, have labels yet.

6. 3D Printed Binary Trees

A simple hub/connection system to connect nodes of a binary tree together to have a physical object for visually impaired computer science students to use for initial introduction into the subject of (binary) trees.

6.5 JavaScript Binary Trees

Have a simple module for loading in an SVG of a tree, along with JavaScript to make the diagram accessible by jumping to the left/right child with the left/right arrow keys and up to the parent with the up arrow.
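
The key handling itself is small. Below is a minimal sketch of just the navigation logic, written in Python rather than JavaScript so it can be run on its own; the Node class, the link helper and the key names are assumptions for illustration, and wiring it to real SVG elements and focus events is left out.

```python
class Node:
    """One node of the tree diagram; parent/left/right are other Nodes or None."""

    def __init__(self, label):
        self.label = label
        self.parent = None
        self.left = None
        self.right = None


def link(parent, left=None, right=None):
    """Attach children to a parent and set their back-references."""
    parent.left, parent.right = left, right
    for child in (left, right):
        if child is not None:
            child.parent = parent
    return parent


def move(focus, key):
    """Return the node that should receive focus after a key press.

    If there is nothing in that direction, focus stays where it is.
    """
    target = {
        "ArrowLeft": focus.left,
        "ArrowRight": focus.right,
        "ArrowUp": focus.parent,
    }.get(key)
    return target if target is not None else focus


root = link(Node("root"), Node("left node"), Node("right node"))
focus = move(root, "ArrowRight")   # now on "right node"
focus = move(focus, "ArrowUp")     # back on "root"
```

Staying put on a dead end (instead of wrapping around) is a design choice; a real version would probably also announce "no left child" or play a sound.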

7. Lego/Pi-Powered Logic Gates

Lego or 3D-printed logic gates with physical switches for in and out. Again, sort of as an introductory tool for blind students learning computer science.

8. More Tutorials/Materials

Perhaps a broader selection of materials for computer science students with proper transcriptions for everything in this list:

Although developing these is good, I think it is worthwhile to also create tools that make creation of these easier for both sighted and blind individuals. This will make it easier for course transcribers who are not tech-savvy and will enable the blind student to create the diagrams and send them back to their teachers. Preferably have a “plain text” version which can be rendered as an SVG for use by visual learners, then make sure the SVG can be made accessible with a JavaScript hook. This would (in theory) make it possible for a teacher to create the graphic in the specialized tool for that kind of chart, put it in their slides/course info/textbook/whatever and have the student able to extract the SVG and paste it somewhere where a script could make it readable. Yes, the best case is the teacher cooperates 100%, but considering that is never the case, I figure making it easier to convert between the two is the best I can hope for.

Some other things I would like to do, if I could find the time:

This would all be licensed as CC-BY-NC-SA. I may drop the NC. As long as I have specified SA, then anyone (even for-profit companies) can use it as long as any changes are shared to the public as well.

9. Self-Voicing Modal Editor (like vim, but accessible)

Some pieces of conversation about it:

“Vim itself is fine, almost any plugin that puts things on the screen isn’t though because all Orca sees is a 2d grid of characters. That’s why I want to build my own modal editor, with a screen reader plugin. I’d build one for vim but I really can’t be bothered to figure out all of its quirks, and viml. I know I can write most of it in Python, but still, it wasn’t designed to allow speech to read the stuff plugins throw on the screen, so it wouldn’t work as well anyway.”

“Vim, which I intend to take inspiration from when it comes to modes and key bindings, doesn’t really fit with the standard key bindings for moving around a graphical app, and it’s hard, sometimes impossible, to replace all the OS standard behaviour in graphical apps, especially while keeping it accessible. Plus, what if you want to run this on a Raspberry Pi, with no desktop environment? What if you want to include it on an accessible Arch Linux install CD, with no Xorg or Wayland?”

“Not just that, earcons for acknowledgement of your actions. If a : command is successful, play a small sound. If it fails, play an error sound and read the error. If your cursor smacks into a line boundary, perhaps play a sound. It all depends on the user settings of course.”

All of these are from TheFakeVIP.

10. New Business Idea

Given the current lack of accessible content at universities (see #8, I think), what if I created a business around supplying accessible diagrams to students? I would have multiple services, and would price them differently depending on my involvement.

  1. Tool Access
    • The university already has somebody with some programming experience doing the transcriptions.
    • The transcriber needs some tools to make his job easier.
    • Python scripts to create binary trees, stacks and queues, concurrent blocks, clock timing diagrams etc.
    • Cheapest option as this requires very little from me other than hosting online access to the tools.
    • Note that although all the source code would be open source, unless the transcriber is actually a software developer by trade (which is very unlikely) they will need access to some kind of web interface to these tools, and they will not be ready to set it up themselves.
  2. Consultation
    • This is what I think works best for most universities.
    • A standard run-of-the-mill transcriber will transcribe all plain-text portions of the documents.
    • I (we, the company) transcribe all diagrams using our ever-expanding arsenal of tools.
    • Flat rate per course per month, over time with more tool development this will work in our favour; at the start it might not quite be worth it, however.
  3. Full Transcription
    • I (we, the company) transcribe the entire set of documents, slides, assignments, reference material, etc. (Although no textbooks… Unless the teacher created the textbook themselves, it would be very hard to get copyright on it… see more on this in next section.)
    • Most expensive option
    • Allows further development of the web tools for our own good.
    • Opens some positions for semi-skilled college/uni students.
  4. Generic Tool Access (added on 2022-02-03):
    • Give access to tools to create a massive amount of diagrams standard to any university experience.
    • Allow it to be styled somewhat for convenience of the professor.
    • Sell the university this tool as a “generic” diagram creation tool. It should then include everything from a Gantt Chart to a timing diagram to a pie/bar chart to plot graphs, etc.
    • This allows me to sell it to them as a tool for the professors, instead of specialty transcribers.
    • Upsides for them:
      • Have all diagrams for the students accessible by default. A teacher need only “export as …” some accessible format for any student entering the class.
      • The above looks great on their reputation.
      • Even in the case they decide not to share any of the slides with the public (see next section), it helps the most people.
    • Downside for them:
      • Profs need to learn a new tool (this could be a HUGE change for them, so it would need to be a very gradual adoption).
      • Cost is higher than Microsoft/Apple’s “free” editor tools.
      • They may already pay for enterprise MS licenses and don’t think it’s worth it to add more cost to something which may not work.
    • Including almost all diagram types imaginable is a HUGE task, and this cannot be overstated.
    • Make each diagram a SEPARATE program/page as much as possible. Make them independent enough that a new format can be added to one tool fairly easily, but not at scale; it’s not necessary to have that level of abstraction.
    • Have a few tools/libraries for high-level manipulation of certain types of formats, for example:
      • A library to manipulate the PIAF version of a diagram, options might include: force capitalization, allow more colors than white/black, change font size. All of which would have warnings saying this is not recommended unless you understand the student’s accessibility needs very well.
      • A library to easily create audiograms. This could be useful for a clock timing diagram, some charts, etc.
    • Formats available, for more generic diagrams; obviously specialized diagrams will not have very many output options due to their complexity/inaccessibility across more than one or two formats.
      • HTML/SVG with/without JS (without is EPub compatible)
      • Audiogram
      • PIAF (microcapsule paper) diagram (this could be created via R w/ Braille font in cases where it’s actually data; then use normal fonts otherwise?)
      • Plain text
      • R output may be useful (?)
      • Spoken (would require some kind of vocalizer commercial license)
    • Each type of diagram simply needs a converter created for it called DIAGRAM-TYPE_TO_FORMAT-TYPE and it should become automatically available.
    • Language possibilities:
      • Rust (Desktop)
      • Rust (WebAssembly?: Desktop/Mobile web)
      • JS (Desktop/Mobile web)
      • Swift (Mobile native)
      • Java (Mobile native)
      • Rust (mobile native? on Android?)
    • I would prefer Rust (with some exceptions, like the audiogram tool, which is easier in shell script due to its simplicity) for the following reasons:
      • If I change any type information, it stops me from leaving errors in the program. The compiler is very opinionated.
      • Easy to implement tests and compare previous outputs to current ones to see if they are still a match after refactor.
      • Dependency management is included for free.
      • Public place to share code/libraries
      • Gives me many options as to where to run the binaries: I think you can compile to ARM (iOS), WebAssembly and native x86-64 for Windows/MacOS/Linux support. I want tools to be universal, so this is the risk I take.
    • Offer things for the following disabilities/accessibility needs:
      • Dyslexia – dyslexic fonts
      • Blindness – tactile/digital/audio formats
      • ADHD – plainer formats to have less distraction
      • MacOS users – No MS product involved, no proprietary formats. (Or at least the option to have a non-proprietary format.) Especially useful in design/business.
      • Linux users – No MS product involved, no proprietary formats. (Or at least the option to have a non-proprietary format.) Especially useful in I.T.
      • Anything else – Having alternate formats easily available to you as a professor allows new ways of teaching. Perhaps they decide learning from audio is better for a certain thing they are creating, they can just do it… no extra work required.
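
The DIAGRAM-TYPE_TO_FORMAT-TYPE converter idea from the list above could be sketched as a small registry, so that a new converter becomes available just by being defined. This is a minimal sketch in Python for brevity (not the Rust I would prefer for the real thing); the names converter, available_formats and convert, the "binary_tree" and "plain_text" keys, and the nested-tuple tree representation are all assumptions made up for illustration.

```python
# Registry of converters, keyed by (diagram type, output format).
CONVERTERS = {}


def converter(diagram_type, fmt):
    """Decorator that registers a DIAGRAM-TYPE_TO_FORMAT-TYPE function."""
    def register(func):
        CONVERTERS[(diagram_type, fmt)] = func
        return func
    return register


@converter("binary_tree", "plain_text")
def binary_tree_to_plain_text(tree):
    """Render a tree given as nested (label, left, right) tuples, or None."""
    lines = []

    def walk(node, depth, side):
        if node is None:
            return
        label, left, right = node
        lines.append("  " * depth + f"{side}: {label}")
        walk(left, depth + 1, "left")
        walk(right, depth + 1, "right")

    walk(tree, 0, "root")
    return "\n".join(lines)


def available_formats(diagram_type):
    """Formats that currently have a converter for this diagram type."""
    return sorted(f for (d, f) in CONVERTERS if d == diagram_type)


def convert(diagram_type, fmt, diagram):
    """Look up the converter and run it, or fail with a clear message."""
    try:
        func = CONVERTERS[(diagram_type, fmt)]
    except KeyError:
        raise ValueError(f"no converter from {diagram_type} to {fmt}") from None
    return func(diagram)


tree = ("A", ("B", None, None), ("C", None, None))
text = convert("binary_tree", "plain_text", tree)
```

An "export as …" menu could then be filled straight from available_formats, which keeps each diagram tool independent of the others.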

10.1 Interface

What will this look like? Let me dream for a second here:

So here’s my plan for sticking to “free culture” licensing while still maintaining profit:

  1. License all code under the GPLv3
  2. License all transcribed documents under CC-BY-NC-SA, and add transcribed files to a directory of information available to anyone.
    • (CC) You may change and redistribute our content.
    • (BY) You must credit the company.
    • (SA) You must keep the same license.
    • (NC) You may not make money selling the documents.
  3. If a school is willing to make the transcribed version of their courses available to the public, then I will offer a pretty substantial discount, probably in the range of 25-ish percent.

I think it would be a great deal better for schools like SAIT, or AUArts, or VCC (smaller schools) to just contract out the hard stuff like this. Given my experience with a larger school (SFU), it makes me think even large schools could use help with it.

10.5 Tactile Diagram Creation Tool

For the case of a diagram like a clock timing diagram, which is basically impossible to just “write” in any exact way, make a tool which can do the following:

  1. Take an image upload (a screenshot from a slide deck, preferably).
  2. Grayscale the image.
  3. Find text by OCR and offer to automatically delete it and replace it with braille.
    • Allow repositioning via an advanced feature set; automatically draw a line to where the OCRed text originally was.
    • This should be relatively straightforward; even with a complex diagram, you can always move the braille outside where the text is.
  4. Save image, print, run through special printer to make black “pop” off the page.
  5. Send to the student: by courier if local, and by the fastest possible shipping by Canada Post/DHL/UPS otherwise. This is expensive, but necessary on tight deadlines.
    • The other option is to have friends in various cities around Canada keep a special printer and special paper so that they can print on demand for me. More money, again, but it also might be worth it, depending on whether I know I will have a consistent base of clients in a given area.

This seems like something that, if it does exist, is probably proprietary and costs the same as my services would as a whole.

10.6 a “slideshow” thingy, but accessible(?)

For every diagram I create, add a “presentation” widget which can take any list of diagrams and present them in an accessible sequence. I will explain.

So imagine you have 5 diagrams of a binary tree, each with a specific action being done to the binary tree to show it changing over time. So you insert some elements, remove some elements. Attach an aria-live region to a comment for each diagram and have hotkeys which can load the next and previous diagram.

So for example you could have a binary tree that looks like:

With the label “this is a height=2 perfect binary tree”. Let the user know somewhere that capslock+alt+n will open the next diagram (or have a button). When the user presses the button, aria-live will announce “Notice where the new node was inserted.”, and display the following diagram:

Keep the user on the same positioned node across a new diagram (unless it is a different type). If the user is focused on “right node” on the first slide, make sure they end up on “right node” on the second slide. Now the user can navigate with context to where they were previously. In the case where a diagram has a corollary twin or equivalent diagram to show in some other form: keep the user’s focus on the correct node in the diagram across a move to a new “slide”?

Probably easy to do if I tag every diagram node with node_id: "abc123" and do the same on the next diagram. If there is a matching node_id attribute on the new diagram’s elements, then focus there automatically.
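
A minimal sketch of that matching step, assuming each diagram is just a list of nodes carrying a node_id and a label (the dictionary layout and the next_focus name are made up for illustration):

```python
def next_focus(current_node_id, next_diagram, default_index=0):
    """Pick the node to focus on after moving to another diagram.

    Prefer the node with the same node_id as the one the user was on;
    otherwise fall back to a default node (the first one here).
    """
    for node in next_diagram:
        if node["node_id"] == current_node_id:
            return node
    return next_diagram[default_index] if next_diagram else None


slide_1 = [
    {"node_id": "root", "label": "root node"},
    {"node_id": "abc123", "label": "right node"},
]
slide_2 = [
    {"node_id": "root", "label": "root node"},
    {"node_id": "abc123", "label": "right node"},
    {"node_id": "def456", "label": "new node"},
]

# The user was on "right node" in slide 1, so they land on it again in slide 2.
focused = next_focus("abc123", slide_2)
```

If the node was removed between slides, falling back to the first node is only one option; announcing that the node is gone might be friendlier.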

Use an Odilia/NVDA plugin to make some of this content even more accessible! Who says open-source can’t have vertical integration?

Ok so something like .patch files can be hard to read, even for sighted individuals, but perhaps with more integration on a screen reader level, you could get earcons for addition, removal and stationary lines of code.
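
To make the .patch idea concrete, here is a minimal sketch that sorts the lines of a unified diff into the kinds an earcon could be attached to. The kind names and the annotate function are assumptions for illustration; no real screen reader plugin API is used here.

```python
def classify_patch_line(line):
    """Classify one line of a unified diff.

    Simplification: any line starting with '---' or '+++' is treated as a
    file header, so a removed line whose content begins with '--' would be
    misclassified. A real parser would track hunk boundaries instead.
    """
    if line.startswith(("+++", "---")):
        return "file-header"
    if line.startswith("@@"):
        return "hunk-header"
    if line.startswith("+"):
        return "addition"
    if line.startswith("-"):
        return "removal"
    return "unchanged"


def annotate(patch_text):
    """Return (kind, text) pairs, with the +/-/space marker stripped from code lines."""
    result = []
    for line in patch_text.splitlines():
        kind = classify_patch_line(line)
        if kind in ("addition", "removal", "unchanged"):
            text = line[1:]
        else:
            text = line
        result.append((kind, text))
    return result


patch = (
    "--- a/hello.txt\n"
    "+++ b/hello.txt\n"
    "@@ -1,2 +1,2 @@\n"
    " keep this line\n"
    "-old line\n"
    "+new line\n"
)
events = annotate(patch)
```

A plugin could then play one sound per kind and speak only the text, so the user never has to hear "plus" or "dash" read aloud.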

Another example:

Perhaps a screen reader user wants to know what type of syntax is being highlighted, this is pretty easily done, even on a static site, with CSS highlighting and a good syntax highlighter which can parse your text and turn it pretty with span elements. Now, what if, through a plugin, you could hit something like CapsLock+Alt+T to get the type of token you are currently focused on (i.e., “variable”, “function”, “object”, “struct”, “class”, “keyword”, etc.) I’m not sure if this one is useful at all, I just know I like my syntax highlighting lol!

I bet there are a TON of uses for vertical integration through NVDA/Odilia plugins. I will keep writing them down somewhere here…. It really depends on the target audience, as Odilia is targeted at already-technical people and NVDA is primarily used by Windows users; this is to say that an NVDA user will, on average, be less technical than an Odilia user.

11. Addon/Adapter for white canes

UPDATE: Although not ideal in all ways, there is something very similar to this already on the market called We Walk.

Technological white canes do exist, but they are specialty devices with huge, thick tops and handles. Essentially the idea is to somehow fit all that tech into a smaller and smaller device so that it can eventually be sold as an assembly that a blind individual can put together themselves. This will work with their existing cane, instead of requiring a specialty device. There must be a way to embed a tiny microcontroller to at least pass sensor data out to a bunch of pin outs (preferably analog). This would enable a secondary device to do things with that data, like send it to a phone app, or output it as some kind of audio via a headphone jack. Even if it requires a small additional box to hold the larger device (the one that actually processes the information), this would be significantly cheaper than purchasing a new cane or needing specialized repair services. The kit will include some kind of serial cable which can be used to connect to the pieces of tech specifically. This will allow updating of the software/firmware (potentially; updating the software for an ATMega* is extremely difficult without additional hardware, maybe impossible, and we’re looking to reduce the size as much as possible).

11.1 Genericize

Genericization (as I like to call it), is trying to use the most generic possible information, program, product, etc. and to improve upon that instead of trying to create something entirely new from scratch. This is something which seems to be missing in many accessibility applications; that said, it is somewhat used for example with screenreaders: screenreaders allow blind individuals to browse the internet and read documents on the computer in a somewhat comparable way to their sighted counterparts. Applying this principle to white canes, we want to improve upon the existing primitive long white stick to make it more functional with modern technology. Always optional, always have many standard ways to use it (USB, audio jack, generic serial port if needed). This is related to the two UNIX principles “always use text as it is a universal interface” and “do one thing and do it well”; in other words: do not recreate the cane; it is more or less perfect the way it is! Instead, offer an optional and improved experience on top of using a cane.

Of course, all software, hardware, layout diagrams, etc. will be open to the public, available as embosser files, etc. to enable blind tech nerds to do whatever they want with it. If one person has a use for some feature, somebody else probably also does.

11.2 Speaker Lock-On

Not sure if this is possible with current technology, but if it is, this could be worth something: what if you could listen for the loudest sound in a 360° ring around the outer edge of a cane and attempt to “lock on” to it, then use positional audio (either via iOS’s API or some more advanced open-source audio interface) to keep making a beep noise as you get closer or face the correct direction?

I understand that this is not really that useful all the time, because anything which makes noise usually will continue to make noise, and you can just go up to it with your human ears. It’s more like a “save” button for audio position. It would probably need some infrared sensors to detect distance, and obviously this would only work with direct line-of-sight targets.

11.3 Advanced GPS, integrated

Again, this may be already outdone by the iPhone and Android phone already available to us in our pockets, but hear me out: What if you only had to walk a route once to know where to go next time?

I know this would likely create a rift between the “traditional” and “new age” approach almost instantly. The traditionalist would argue, with some merit, that the inability to navigate without a charged battery or your specific cane could decrease independence in those who use it. The new agers would argue, also with some merit, that this enables somebody to walk a blind student through a few major routes on a university campus or downtown core (for example) only once, then (with a combination of high-end GPS, barometers, accelerometers (TODO), etc.) the cane itself could record your route.

It could have messages like left and right turns at near exact places, “stairs/escalator ahead, going up” messages. Maybe even identification of material depending on how complete our dataset is and if the cane user is using a two-point-touch system or rolling the cane back and forth across the surface. The most useful case, I think, is being able to upload your (anonymous) routes to a service which can store all these routes on a public database. Obviously, you would need some kind of anonymization feature to stop people from getting your address or whatever. Or, depending on the verbosity of the data, a way to “smooth out” the exact cane movements so as not to identify the cane holder.

Having a centralized place to store these maps (for the public, all the data should be useable and clonable) could potentially make it so there is only one blind student who needs to navigate the campus a few times and then be able to share that data with everyone else.

Extensions

Perhaps it could even be extended with such things as “there is a sign which says XYZ here or ABC there”? This to me is even more powerful! A partially-sighted white-cane user (or a blind user and a guide) could map all signs, hallway intersections, meeting points, etc. into the map. The details of this, I am not sure of; perhaps to start off there is just an additional button available which can do something like “save landmark” or some such thing. Then, once opened on a computer (or mobile app?) you could add information to each point like “sign says XYZ” or “corner of Main St. and 1st Ave”. Are current GPS units, barometers, accelerometers, etc. accurate enough to do this?

This would also make searching for a “new route” very efficient, if we’re able to match across already published routes. Then, we are able to connect “Main St. & Main Ave. to Train Station” and “Train Station to grocery store” routes together into one continuous route. I could see issues with this due to possible differences in where the routes exactly start and end. Perhaps this is too advanced and is better suited to a mobile app and not a cane extension (as in, only use sensors on the cane, do not have any additional functionality beyond that). I think the possibilities here are interesting; perhaps there is the possibility of using something like this just inside university campuses and malls. I could understand if this wouldn’t beat Google Maps most times; that said, it’s a route another white-cane user has actually taken before, so it is guaranteed to work unless something in the environment has changed since then. // TODO

Obviously there are a lot of other things to consider, like user privacy and how to keep the sensor signals from the cane from getting lost, intercepted, modified, etc.
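
One way to deal with routes that do not start and end in exactly the same place is to join them only when the gap is small. A minimal sketch, assuming a route is a plain list of (latitude, longitude) points and that 15 metres is an acceptable gap; the tolerance, the function names and the coordinates are all made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def join_routes(first, second, tolerance_m=15.0):
    """Chain two routes if the end of the first is close to the start of the second.

    Returns the combined list of points, or None if the gap is too large.
    """
    if not first or not second:
        return None
    if haversine_m(first[-1], second[0]) > tolerance_m:
        return None
    return first + second


# Hypothetical coordinates: the two routes meet within a few metres of each other.
to_station = [(49.2790, -123.1200), (49.2800, -123.1200)]
station_to_store = [(49.28005, -123.1200), (49.2810, -123.1190)]
combined = join_routes(to_station, station_to_store)
```

Whether a fixed tolerance is good enough depends on how accurate the recorded positions are, which is exactly the open question above.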

Security is always a concern, no matter how remote the possibility is of some kind of attack.

12. Meme Template Recognition

Meme Template Recognition (MTR) should OCR the plain text found in a meme, and also look up the closest match in terms of which meme template is used. Have each template with a description of the meme and how it is generally used.

13. Odilia Ideas

Odilia is a new Linux screenreader project. I have some ideas about it to write down here.

13.1 Cache

A cache would be relatively simple to implement, or so I think right now. If I watch the output of dbus-monitor when Orca is running and it shows me all my accessibility events, I can see that it basically boils down to extremely simple events:

I think it would be (fairly) easy to cache the entire a11y tree of at least a web container, but maybe other types of interfaces as well. I would not store all information about each item. Only critical information to VASTLY speed up things like structural navigation.

For example: if you look for the next header in a web document by pressing “h” in browse mode, a lookup for every element’s parent, AT-SPI role, and children needs to be made over DBus. If it were possible to do these intensive tasks within native Rust code, without calling out to DBus, this could be a game changer. It would make it INSANELY fast for structural nav, and if it needs to call out to DBus for the text, that’s fine by me. Storing text could potentially balloon the memory usage pretty hard anyway, so I’m not sure that’s a great idea.
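
Here is a minimal sketch of what "next heading" could look like against such a cache, in Python for brevity rather than the Rust the real thing would use. The CachedItem fields, the role strings and the function names are assumptions for illustration and not Odilia's actual data structures; no DBus call is made anywhere in it.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class CachedItem:
    """The small amount of per-element data worth caching (no text)."""
    id: int
    role: str
    parent: Optional[int]
    children: List[int] = field(default_factory=list)


def document_order(cache: Dict[int, CachedItem], root_id: int) -> Iterator[int]:
    """Yield ids in pre-order, i.e. the order a user reads the page."""
    stack = [root_id]
    while stack:
        item = cache[stack.pop()]
        yield item.id
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(item.children))


def next_with_role(cache, root_id, current_id, role):
    """Id of the first element after current_id with the given role, or None.

    This is a plain linear scan over the cached tree; it is only meant to
    show that the lookup needs nothing beyond role, parent and children.
    """
    seen_current = False
    for item_id in document_order(cache, root_id):
        if seen_current and cache[item_id].role == role:
            return item_id
        if item_id == current_id:
            seen_current = True
    return None


cache = {
    1: CachedItem(1, "document", None, [2, 3, 5]),
    2: CachedItem(2, "heading", 1),
    3: CachedItem(3, "paragraph", 1, [4]),
    4: CachedItem(4, "link", 3),
    5: CachedItem(5, "heading", 1),
}
next_heading = next_with_role(cache, 1, 2, "heading")  # the second heading, id 5
```

Only once the target element is found would the screen reader need to go out over DBus to fetch its text.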

So, store some basic info like role, id, parent id, ids of children, maybe a few more. All in all it should represent less than 50 bytes on average. This means that even with a million items, which I hope no one would ever be so unfortunate as to need, it could all be contained safely within about 50MB of RAM (1,000,000 × 50 bytes). It would be worth the tradeoff in nearly every single case I can think of. And even if we added, say, a 100-byte string buffer for smaller strings to be cached, we’d still be around 150MB of RAM for 1 million elements cached. I would like to point out that a million elements is stupendously high, and is fairly impractical to begin with. I think the memory-time tradeoff will win in this case.

At the time of writing, the Odilia screenreader is not far enough along to either benchmark or implement these features, but I just need to write them down before I forget why it was a good idea. I think, personally, that a small string buffer for practically-sized strings would be ideal, as this could also speed up potential addons (think control+f but for an entire GUI application, not just the web) and this would still approach fair RAM usage and impeccable speed, although I do understand that this view may go a little far for the other two developers working on that project. We’ll see.
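
The back-of-the-envelope numbers can be checked in a couple of lines. The 50-byte and 100-byte figures are the estimates from the paragraph above, not measurements, and cache_size_mb is just a helper made up for this check.

```python
def cache_size_mb(elements, bytes_per_element):
    """Approximate cache size in megabytes (1 MB = 1,000,000 bytes)."""
    return elements * bytes_per_element / 1_000_000


basic = cache_size_mb(1_000_000, 50)               # role, ids, parent, children
with_strings = cache_size_mb(1_000_000, 50 + 100)  # plus a 100-byte string buffer
typical_page = cache_size_mb(10_000, 50 + 100)     # a more realistic element count
```

With these assumptions the results are 50.0, 150.0 and 1.5 MB respectively, so a more ordinary page of ten thousand elements would cost very little.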