Unscrambling Files - some progress

Declan Moriarty junk_mail at iol.ie
Fri Nov 4 16:39:57 PST 2005


Recently, Somebody Somewhere wrote these words
> On Fri, 4 Nov 2005, Declan Moriarty wrote:
> 
> >There is a FIELDWIDTHS declaration I discovered, which when you
> >declare it is used instead of FS, e.g.
> >
> >BEGIN { FIELDWIDTHS = "12 3 19 3 2 3 47" }
> >
> >where the threes are space fields
> >
>  Interesting.
> 
> >I never got round to deleting info. I'm sure many do.  I have LFS-5.0
> >in everyday use. I'm backward, but not that far back.
> >
>  ;-)
> 
> >>
> >> Generally, scripting questions fit better on lfs-chat, and you are
> >> more likely to get good answers there.
> >>
> >I don't do chat, largely because I talk too much.
> >
> >I may join if I can make any headway with the file. But everything
> >has to be described in posix regexes, which are not as handy as perl
> >imho
> >
> >If I can't describe the sections in posix REs, I can't bring awk or
> >sed to bear on it. I want to split each line at one very awkward
> >point.
> >
> 
>  Most of us find perl is a lot harder than awk or sed, but yes the
>  regexes can be "simpler".  How about using some perl from the shell
>  to write the regexes, along the lines of

I can't do perl at all, but I can handle the regexes (largely thanks to the
marvellous education spamassassin gives). With perl-style regexes, things
work - that's the difference. For instance,

egrep '^\d{12}' does _not_ find lines starting with 12 digits, but

pcregrep '^\d{12}' does. Have I bad breath or something?
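
For what it's worth, the likely explanation is that \d is a perl-ism which
POSIX extended regexes don't define, so egrep does not read it as "a digit".
A minimal sketch of the POSIX spelling, using a bracket expression instead;
the function name and the sample lines are made up for illustration:

```shell
# Hypothetical helper: print only the lines that start with 12 digits.
# [0-9] (or [[:digit:]]) is the POSIX way to say what \d says in perl.
starts_with_12_digits() {
    grep -E '^[0-9]{12}'
}

# Made-up sample data: only the first line begins with 12 digits.
printf '%s\n' '200511041639 some text' 'abc 123' '12345 short' |
    starts_with_12_digits
```

The {12} interval itself is fine in egrep; it is only the \d that needs
replacing.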

> 
>   cat $1 | perl -p \
>     -e 's/-your-regexp-here-/-and-here-/g;' \
>     -e 's/another-regexp/and-output/g;'
> 
>  and pipe the output to awk to format it.
> 
>  Or, create a little test data (in the form you expect to come out of
>  the regexes), hack up some awk to print it, then try a2p on the awk
>  to get it into perl if you are happier with that.  Break the problem
>  down into manageable chunks.  Use the tools you are most comfortable
>  with.

You're giving me good advice, Ken, and thanks. I was just ready for a
home for the bewildered after grokking the gawk manual.
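
On the FIELDWIDTHS idea quoted above: as far as I can tell from the gawk
manual, the widths go in a quoted string set in a BEGIN block, and it is a
gawk extension rather than plain awk. A rough sketch of the same split done
with substr(), which should work in any awk; the helper name, the sample
line and the '|' output separator are all invented for illustration:

```shell
# Hypothetical helper: cut each input line into fixed-width columns.
# The widths are the ones from the FIELDWIDTHS example: 12 3 19 3 2 3 47.
split_fixed() {
    awk 'BEGIN { n = split("12 3 19 3 2 3 47", w, " ") }
    {
        pos = 1; out = ""
        for (i = 1; i <= n; i++) {
            # substr() takes a 1-based start position and a length
            out = out (i > 1 ? "|" : "") substr($0, pos, w[i])
            pos += w[i]
        }
        print out
    }'
}

# Made-up sample line, padded out to the same column widths.
printf '%-12s%-3s%-19s%-3s%-2s%-3s%s\n' \
    200511041639 '' 'Declan Moriarty' '' IE '' 'rest of the line' |
    split_fixed
```

The three-character spacer columns come out as fields of their own, so a
real script would probably just skip fields 2, 4 and 6.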

-- 

	With best Regards,


	Declan Moriarty.


