Re: <3

Date: 2024-09-08 01:05 am (UTC)
garote: (Default)
From: [personal profile] garote
Unfortunately I'm not sure how to fix this issue. It's possible that LJ sends the encoding information as part of the XML when one fetches an entry, e.g.:

<?xml version="1.0" encoding="WINDOWS-1251"?> ..... </xml>

If so, that declaration can be used to decide which encoding to use when converting the entry to Unicode. But right now, unless there's some magic happening in the Python XML parser that I don't know about, it always assumes UTF-8, so text in e.g. WINDOWS-1251 gets mangled.
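To make the idea concrete, here is a minimal sketch of reading the declared encoding from the XML prolog and using it to decode the raw bytes. It assumes LJ really does include an `encoding="..."` declaration in what it sends, which I haven't confirmed. The names `declared_encoding` and `decode_entry` are hypothetical helpers for illustration, not part of any existing script.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration, e.g.
# <?xml version="1.0" encoding="WINDOWS-1251"?>
_DECL_RE = re.compile(
    rb'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def declared_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, or `default`."""
    match = _DECL_RE.match(raw)
    return match.group(1).decode("ascii") if match else default


def decode_entry(raw: bytes) -> str:
    """Decode raw entry bytes using the declared encoding (hypothetical helper).

    Falls back to lossy UTF-8 if the declared encoding is unknown to Python
    or the bytes are not valid in that encoding.
    """
    encoding = declared_encoding(raw)
    try:
        return raw.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = (
        '<?xml version="1.0" encoding="WINDOWS-1251"?><entry>Привет</entry>'
    ).encode("cp1251")
    print(declared_encoding(sample))
    print(decode_entry(sample))
```

This only helps if the bytes are still available undecoded at the point where the entry is fetched. If something upstream has already decoded them as UTF-8, the original text can't be reliably recovered.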

LJ renders it just fine when presenting its own web interface, so either LJ preserves the encoding information internally, or it follows some kind of guessing procedure to convert it to UTF-8. One could theoretically answer that question by crawling through the LJ source code.
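For what it's worth, a "guessing procedure" could be as simple as the sketch below: try strict UTF-8 first, then fall back to a list of legacy encodings. This is purely an illustration of the idea. I don't know what LJ actually does, and both `guess_decode` and the choice of `cp1251` as the fallback are assumptions.

```python
from typing import Sequence, Tuple


def guess_decode(
    raw: bytes, fallbacks: Sequence[str] = ("cp1251",)
) -> Tuple[str, str]:
    """Decode bytes by trial, returning (text, encoding_used).

    Strict UTF-8 is tried first because legacy single-byte text is
    rarely valid UTF-8 by accident. The fallback list is an assumption.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass
    for encoding in fallbacks:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Nothing fit cleanly; keep what we can and mark the result as lossy.
    return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"


if __name__ == "__main__":
    print(guess_decode("Привет".encode("cp1251")))
    print(guess_decode("Привет".encode("utf-8")))
```

The weakness of this approach is that most single-byte encodings accept almost any byte sequence, so only the first fallback in the list will ever realistically be chosen. It cannot tell WINDOWS-1251 apart from, say, KOI8-R.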
