Thoughts on summarization service system design

A summarization task takes an input and reduces it to a handful of paragraphs. This input is in text format. You don’t necessarily start from text, though, since the source content can be an audio or video file. In that case, the source has to be converted into text first, which involves a transcription task. Transcription means taking audio and converting it to text. Luckily, these days there are APIs you can use to achieve this. Supported formats depend on the API provider, but it’s safe to assume most would support WAVE or FLAC encoding. ...

June 9, 2024 · 3 min · Karn Wong

Some problems can be solved with workflows

When we are faced with engineering problems, it’s too easy to fall into the trap of thinking they should be solved with a technical solution. Seasoned engineers think differently, because they realize that most of the time, these are “people” or “workflow” problems. Let me provide a few examples. Management wants analysts to use Jupyter Notebook to reduce the time required to create a routine report. Background: Most analysts are comfortable using Microsoft Excel to work with data, and some can also use SQL, but it’s rare for analysts to be familiar with Python. Jupyter Notebook is an interactive development interface for data work: users can execute one chunk of code at a time and render data without re-running the full code. Problem: every month, a two-person analyst team would spend two days stitching together multiple CSV files (up to 60) via VLOOKUP for a monthly report. This is because analysts have to look up information for each record, for which they use a template query and manually execute 50 queries with changed parameters. ...

November 24, 2023 · 3 min · Karn Wong