sed script to remove file name duplicates

Posted by dma_k on Stack Overflow
Published on 2010-05-19T16:43:04Z Indexed on 2010/05/19 17:10 UTC

Dear community,

I hope the task below will be very easy for sed lovers. I am no sed guru, but I need to express it in sed, as sed is more popular on Linux systems.

The input text stream is produced by "make depends" and looks like this:

pgm2asc.o: pgm2asc.c ../include/config.h amiga.h list.h pgm2asc.h pnm.h \
 output.h gocr.h unicode.h ocr1.h ocr0.h otsu.h barcode.h progress.h
box.o: box.c gocr.h pnm.h ../include/config.h unicode.h list.h pgm2asc.h \
 output.h
database.o: database.c gocr.h pnm.h ../include/config.h unicode.h list.h \
 pgm2asc.h output.h
detect.o: detect.c pgm2asc.h pnm.h ../include/config.h output.h gocr.h \
 unicode.h list.h

I need to pick out only the C header files (i.e. names ending in .h), make the list unique, and print it as a space-separated list with src/ prepended as a path prefix. The following perl script achieves this:

make libs-depends | perl -e 'while (<>) { while (/ ([\w\.\/]+?\.h)/g) { $a{$1} = 1; } } print join " ", map { "src/$_" } keys %a;'

The output is:

src/unicode.h src/pnm.h src/progress.h src/amiga.h src/ocr0.h src/ocr1.h src/otsu.h src/barcode.h src/gocr.h src/../include/config.h src/list.h src/pgm2asc.h src/output.h

Please help me express this in sed.
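
To make the target concrete, here is a rough sketch of the kind of pipeline I have in mind. It is not pure sed: sed does the filtering and the src/ prefixing, while tr, sort and paste do the splitting, de-duplication and joining. The function name list_headers is made up for illustration, and unlike the perl regex it accepts any whitespace-separated token ending in .h.

```shell
# list_headers (hypothetical name): read "make depends"-style output on stdin
# and print the unique header files, each prefixed with src/, on one line.
list_headers() {
  # 1. Put every whitespace-separated token on its own line.
  tr -s ' \t' '\n\n' |
  # 2. Keep only tokens ending in ".h" and prepend "src/" to them.
  #    Targets (foo.o:), sources (foo.c) and the "\" continuation
  #    marks do not match and are dropped by -n.
  sed -n 's|^\(.*\.h\)$|src/\1|p' |
  # 3. Remove duplicates; LC_ALL=C keeps the order reproducible.
  LC_ALL=C sort -u |
  # 4. Join the remaining lines into one space-separated list.
  paste -sd ' ' -
}
```

It would be called as `make libs-depends | list_headers`. The result is sorted rather than in hash order as with the perl version, which is fine for my purpose since only the set of names matters.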

© Stack Overflow or respective owner