metadata: fix text wrapping on unicode characters
author Daniel Martí <mvdan@mvdan.cc>
Sun, 10 Jan 2016 16:54:38 +0000 (17:54 +0100)
committer Daniel Martí <mvdan@mvdan.cc>
Sun, 10 Jan 2016 16:54:38 +0000 (17:54 +0100)
We were passing the UTF-8 encoded byte string to textwrap, which treated
each byte as a character. Multi-byte UTF-8 characters therefore counted as
multiple columns, so lines containing them were wrapped too early. Decode
to unicode before wrapping and encode the result again before writing.
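
The effect can be illustrated with a minimal sketch. It is not part of the
commit: the helper name wrap_utf8 and the sample string are assumptions made
for illustration, and only textwrap.fill with the same keyword arguments as
in the diff is taken from the change itself.

```python
import textwrap


def wrap_utf8(data, width=80):
    """Hypothetical helper: wrap UTF-8 encoded bytes by character count."""
    # Decode first so textwrap measures characters rather than bytes.
    text = data.decode('utf-8')
    wrapped = textwrap.fill(text, width,
                            break_long_words=False,
                            break_on_hyphens=False)
    # Re-encode so the caller still receives UTF-8 bytes.
    return wrapped.encode('utf-8')


# Two five-character words; each character takes two bytes in UTF-8.
sample_text = u'\u00e9' * 5 + u' ' + u'\u00e9' * 5
sample_bytes = sample_text.encode('utf-8')

char_count = len(sample_text)    # measured in characters
byte_count = len(sample_bytes)   # measured in bytes, larger than char_count

# At a width equal to the character count the text fits on one line.
result = wrap_utf8(sample_bytes, width=char_count).decode('utf-8')
```

Measured in bytes, the same text would exceed that width and be split across
two lines, which is the behaviour this commit removes. Note that textwrap
still counts characters, not display columns, so double-width characters may
render wider than the nominal width.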

fdroidserver/metadata.py

index 6e8bdf32ebd73bc83e8a6b31ac265f9430986576..b1924357eaf8af91d6c4841fcf5db69baf8eb98b 100644
@@ -525,9 +525,10 @@ class DescriptionFormatter:
         self.state = self.stNONE
         whole_para = ' '.join(self.para_lines)
         self.addtext(whole_para)
-        self.text.write(textwrap.fill(whole_para, 80,
-                                      break_long_words=False,
-                                      break_on_hyphens=False))
+        wrapped = textwrap.fill(whole_para.decode('utf-8'), 80,
+                                break_long_words=False,
+                                break_on_hyphens=False)
+        self.text.write(wrapped.encode('utf-8'))
         self.html.write('</p>')
         del self.para_lines[:]