Unix pipe to clipboard with WSL, and in UTF-8

2024-04-17
WSL is good, and it would be even better if we could pipe stdout directly into the Windows host’s clipboard, like ./myscript.sh | to-my-clipboard.sh. Note that xclip requires X, so it is not an option. From WSL, we have direct access to Windows executables, so the easiest method is clip.exe: ./myscript.sh | clip.exe. This works fine as long as you don’t touch CJK characters or emoji. clip.exe doesn’t handle the encoding well, because the default code page in Windows is not 65001 (UTF-8). Continue reading
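As a rough illustration of the encoding problem described above, here is a minimal Python sketch. It is not necessarily the method the full post arrives at. It assumes that clip.exe accepts UTF-16LE input, which is a commonly reported workaround rather than something stated in this excerpt, and the helper names `encode_for_clip` and `copy_to_clipboard` are hypothetical.

```python
import subprocess


def encode_for_clip(text: str) -> bytes:
    """Re-encode text as UTF-16LE (assumption: clip.exe reads this correctly)."""
    return text.encode("utf-16-le")


def copy_to_clipboard(text: str) -> None:
    """Pipe the re-encoded text into clip.exe; only works inside WSL."""
    subprocess.run(["clip.exe"], input=encode_for_clip(text), check=True)


# Usage inside WSL (not executed here, because it needs clip.exe on PATH):
#   import sys
#   copy_to_clipboard(sys.stdin.read())
```

Saved as a script that reads stdin, this could play the role of the to-my-clipboard.sh placeholder in the pipeline above.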

Several tips for better working with Python in Excel

2023-10-10
After enrolling in the Python in Excel preview, I can now type =py( in any Excel cell to write Python code. Python in Excel doesn’t have detailed documentation, only Get Started tutorials (like link 1, link 2) or articles on topics like pandas, matplotlib, and seaborn, which may leave you confused when you want to do real work in an unfamiliar environment. This article notes several tips from my understanding of Python in Excel. Continue reading

Analyze robots.txt with Python Standard Library

2023-09-24
If I hadn’t searched for both “python” and “robots.txt” in the same input box, I would never have known that the Python Standard Library can parse robots.txt with urllib.robotparser. But the official documentation of urllib.robotparser doesn’t go into detail. With it, you can check whether a URL can be fetched by a robot with robot_parser_inst.can_fetch(user_agent, url) if you are building a crawler bot yourself. But if you want to do some statistics about robots. Continue reading
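The can_fetch check mentioned above can be sketched as follows. The robots.txt content, the user agent "my-crawler", and the example.com URLs are made up for illustration; the rules are parsed from a string so that the sketch runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Create a parser from robots.txt text instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


robot_parser_inst = build_parser(ROBOTS_TXT)

# Rules are checked in file order, so /private/ is rejected before "Allow: /".
allowed = robot_parser_inst.can_fetch("my-crawler", "https://example.com/index.html")
blocked = robot_parser_inst.can_fetch("my-crawler", "https://example.com/private/data.html")
```

For a live site, RobotFileParser also offers set_url and read to download the file before calling can_fetch.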

Process AWS Kinesis Firehose data with Python

2023-09-11
Pipelining streamed data events directly into Amazon S3 via the AWS Kinesis Firehose service is convenient and efficient, and we can use Python with boto3 to consume the data directly from S3. This allows for seamless storage of your data, ensuring its integration and accessibility. Mostly, we are dealing with JSON-formatted event logs. But there is one tiny stone in the shoe for logs fed from AWS Kinesis Firehose: there is no newline between consecutive log entries. Continue reading
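One possible way to split such back-to-back JSON entries, sketched with the standard library only, is json.JSONDecoder.raw_decode, which parses one document and reports where it ended. This is not necessarily the approach the full post takes; the function name `iter_json_objects` is hypothetical, and downloading the object body from S3 with boto3 is left out, so the function takes already-decoded text.

```python
import json
from typing import Any, Iterator


def iter_json_objects(text: str) -> Iterator[Any]:
    """Yield each JSON document from text where documents are concatenated
    without separators, e.g. '{"a": 1}{"b": 2}'."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # Skip any whitespace between documents (raw_decode rejects leading spaces).
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed object and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


sample = '{"event": "start"}{"event": "stop", "tags": ["a", "b"]}'
events = list(iter_json_objects(sample))
```

Because the decoder tracks string boundaries itself, braces inside string values do not confuse the split, which a naive replace of "}{" would get wrong.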

The curious cases of json_extract

2020-04-25
json_extract is a function for extracting data from a JSON document in both MySQL and MariaDB. The function normally works fine, for example:

set @j = '{"num": 42, "list": [1, 2, 3], "obj": {"name": "Edward Stark"}}';
select json_extract(@j, '$.num') as num, json_extract(@j, '$.list') as list, json_extract(@j, '$.obj') as obj;

+------+-----------+--------------------------+
| num  | list      | obj                      |
+------+-----------+--------------------------+
| 42   | [1, 2, 3] | {"name": "Edward Stark"} |
+------+-----------+--------------------------+

But when it comes to a single JSON string, the result of json_extract is not always as expected. Continue reading

Contextvars and Thread local

2019-04-12
In this post, I will share some examples of contextvars (new in Python 3.7) and thread-local variables. Default Value: at the module level, calling ContextVar.set, or directly using setattr on a thread-local variable, won’t successfully set a default value, because the value set won’t take effect in another thread. To ensure a default value, for contextvars: import contextvars; context_var = contextvars.ContextVar("context_var", default=0). For thread local, a sub class of thread. Continue reading
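The default-value behavior described above can be sketched as below. The excerpt is cut off before the thread-local part, so the `threading.local` subclass with an `__init__` is an assumption about what the full post shows, based on the pattern documented for threading.local; all variable and class names other than `contextvars.ContextVar` and `threading.local` are hypothetical.

```python
import contextvars
import threading

# A ContextVar with default=0 falls back to 0 in any context, including new threads.
with_default = contextvars.ContextVar("with_default", default=0)

# A plain thread-local set at module level is only set for the importing thread.
plain_local = threading.local()
plain_local.value = 1


class LocalWithDefault(threading.local):
    """Assumed pattern: __init__ runs once per thread that touches the object."""

    def __init__(self):
        self.value = 0


local_with_default = LocalWithDefault()

results = {}


def worker():
    # Collect what a freshly started thread actually sees.
    results["with_default"] = with_default.get()
    results["plain_local"] = getattr(plain_local, "value", "missing")
    results["local_with_default"] = local_with_default.value


thread = threading.Thread(target=worker)
thread.start()
thread.join()
```

After the thread finishes, `results` records that the module-level attribute on `plain_local` is missing in the worker thread, while both default mechanisms supply 0.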

MongoDB (via MongoEngine) join query with aggregate

2017-01-09
Since MongoDB 3.2 and MongoEngine 0.9, we can use the aggregate command (with its $lookup stage) to perform join queries on multiple collections in a database. This post is a simple tutorial on join queries in MongoDB (via MongoEngine in Python) with examples. Models Setup: let’s consider models defined as below: import random import mongoengine class User(mongoengine.Document): meta = {"indexes": ['rnd']} name = mongoengine.StringField() rnd = mongoengine.FloatField(default=random.random) class Group(mongoengine.Document): meta = {"indexes": ['rnd']} name = mongoengine. Continue reading