[LRUG] UTF8 errors parsing mail file

Najaf Ali ali at happybearsoftware.com
Thu Aug 22 08:59:47 PDT 2013


What Fred said. I don't know anything about Perl, but my guess is that it's
loading all the files as regular old byte streams whereas Ruby is choking
on some invalid UTF-8 in your files. If you want to inspect your files for
the bad chars, piping them into hexdump -C might yield a few clues (as Fred
mentioned, vim won't be of much use here).
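
If you'd rather stay in Ruby than reach for hexdump, here is a rough sketch that does the same job. It reads each file in binary mode so nothing raises, then reports the lines whose bytes aren't valid UTF-8 along with a hex dump of each one. The method name invalid_utf8_lines and the path argument are made up for illustration, and I'm assuming your mail files are small enough to scan line by line.

```ruby
# Return [line_number, hex_dump] pairs for every line that is not valid UTF-8.
# `path` is a hypothetical argument: any single file from the Maildir.
def invalid_utf8_lines(path)
  bad = []
  # "rb" = binary mode: lines come back tagged ASCII-8BIT, so reading never
  # raises on odd bytes.
  File.open(path, "rb") do |f|
    f.each_line.with_index(1) do |line, lineno|
      # Relabel a copy of the bytes as UTF-8 (no transcoding) and validate.
      utf8 = line.dup.force_encoding(Encoding::UTF_8)
      next if utf8.valid_encoding?

      hex = line.unpack("C*").map { |b| format("%02x", b) }.join(" ")
      bad << [lineno, hex]
    end
  end
  bad
end

# Command-line use: ruby this_script.rb file1 file2 ...
if __FILE__ == $PROGRAM_NAME
  ARGV.each do |path|
    invalid_utf8_lines(path).each do |lineno, hex|
      puts "#{path}:#{lineno}: #{hex}"
    end
  end
end
```

Any byte of 0x80 or above that isn't part of a well-formed multi-byte sequence is a likely culprit, usually Latin-1 or Windows-1252 text in a mail body.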


On Thu, Aug 22, 2013 at 4:27 PM, Frederick Cheung <
frederick.cheung at gmail.com> wrote:

>
> On 22 Aug 2013, at 16:21, gvim <gvimrc at gmail.com> wrote:
>
> > I'm encountering some UTF-8 errors in Ruby 2.0. When installing gems I
> often see non-fatal errors relating to conversion of ASCII characters to
> UTF-8. The following script is designed to search a large Maildir folder
> for lines beginning with 4 word characters:
> >
>
> Are those files guaranteed to contain only valid UTF-8? If not, then you
> might be able to get away with opening them as ASCII-8BIT (assuming that
> you don't need to work with them in a Unicode-aware way).
>
> Fred
>
>
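
For what it's worth, Fred's suggestion might look something like the sketch below. The original script isn't shown in this thread, so the pattern and the method name matching_lines are my guesses at "lines beginning with 4 word characters", not the actual code. Opening with "rb" tags every line as ASCII-8BIT, so Ruby never tries to validate the bytes as UTF-8, and an ASCII-only regex still matches against such strings.

```ruby
# Hypothetical pattern: a guess at "lines beginning with 4 word characters".
FOUR_WORD_CHARS = /\A\w{4}/

# Return the lines of `path` that start with four word characters.
# `path` is a hypothetical argument: any single file from the Maildir.
def matching_lines(path)
  # "rb" opens in binary mode, so each line is tagged ASCII-8BIT and its
  # bytes are never interpreted as UTF-8.
  File.open(path, "rb") do |f|
    f.each_line.select { |line| line =~ FOUR_WORD_CHARS }
  end
end
```

The catch is the one Fred mentions: in a binary string \w only sees ASCII letters, digits and underscore, so an accented character counts as a non-word byte.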



-- 
Ali, http://happybearsoftware.com