For most people, using SQL to transform data is a no-brainer, seeing it’s a very versatile language, and doesn’t have quite a steep learning curve compared to python. There are some cases where SQL is more suitable for a task, but the reverse can also happen as well.
For instance, given a string conversion problem:
- if a string occurs only one time, replace it with
- if a string occurs multiple times, replace it with
> one ### > three ###&& > Heartbreak hotel &&&&&#&&&##&#&&#
A solution in python would be:
from collections import Counter s = "three" s_counter = Counter(s) output_str = '' for char in s: if s_counter[char]>1: output_str += '&' elif s_counter[char]==1: output_str += '#'
But a solution in SQL is…(thanks Emily @data-engineering-discord!):
CREATE TABLE data ( string_value TEXT ); INSERT INTO data VALUES ('one'), ('three'), ('Heartbreak hotel'); select string_value, translate( lower(string_value), string_agg(chr, '' order by chr), string_agg(subs, '' order by chr) ) from ( select string_value, chr, case when count = 1 then '#' else '&' end as subs from ( select p.*, count(*) from ( select string_value, regexp_split_to_table(lower(string_value), '') as chr from data ) as p group by 1, 2 ) as q ) as r group by 1