sed script to remove file name duplicates
- by dma_k
Dear community,
I hope the task below will be very easy for sed lovers. I am no sed guru, but I need to express it in sed, as sed is more widely available on Linux systems.
The input text stream is produced by "make depends" and looks like the following:
pgm2asc.o: pgm2asc.c ../include/config.h amiga.h list.h pgm2asc.h pnm.h \
output.h gocr.h unicode.h ocr1.h ocr0.h otsu.h barcode.h progress.h
box.o: box.c gocr.h pnm.h ../include/config.h unicode.h list.h pgm2asc.h \
output.h
database.o: database.c gocr.h pnm.h ../include/config.h unicode.h list.h \
pgm2asc.h output.h
detect.o: detect.c pgm2asc.h pnm.h ../include/config.h output.h gocr.h \
unicode.h list.h
I need to pick out only the C header files (i.e. names ending in .h), make the list unique, and print it as a space-separated list with src/ prepended as a path prefix. The following Perl script does this:
make libs-depends | perl -e 'while (<>) { while (/ ([\w\.\/]+?\.h)/g) { $a{$1} = 1; } } print join " ", map { "src/$_" } keys %a;'
The output (in arbitrary hash order) is:
src/unicode.h src/pnm.h src/progress.h src/amiga.h src/ocr0.h src/ocr1.h src/otsu.h src/barcode.h src/gocr.h src/../include/config.h src/list.h src/pgm2asc.h src/output.h
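For comparison, the same three steps (extract, de-duplicate, prefix) can be sketched as a shell pipeline in which sed only adds the prefix. This is a rough sketch, not a pure sed solution: the function name unique_headers is made up for illustration, and the de-duplication is handed to sort -u, so the result comes out sorted rather than in Perl's hash order.

```shell
# Hypothetical helper: reads "make depends"-style output on stdin and prints
# the unique .h files, each prefixed with src/, as one space-separated line.
unique_headers() {
    tr -s ' \t\\' '\n\n\n' |   # one token per line; continuation backslashes vanish
    grep '\.h$' |              # keep only tokens ending in .h
    sort -u |                  # make the list unique (and sorted)
    sed 's|^|src/|' |          # prepend the path prefix
    paste -sd ' ' -            # join the lines back into a single line
}
```

It would be used as: make libs-depends | unique_headers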
Please help me express this in sed.