
Like others have said, pup seems to be abandoned. It's surprising that there isn't a well-supported standard tool for working with HTML in the terminal, the way jq is for JSON.

Last time I looked at using pup or something similar, I wanted to extract two values for each element. For example, let's say I have the following HTML:

  <div class="image">
    <p>Sunset in Hawaii</p>
    <img src="../randomstring123.jpg">
  </div>
  <div class="image">
   ...
Now I'd like to get both the image description and the source for each such image on the page, preferably in a form I can pipe to curl and do

  curl -o "$description.jpg" "$url"
I couldn't find an easy way of doing it, so I used Python instead.
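
For reference, here is a rough sketch of what the Python route can look like using only the standard library's html.parser. This is not the exact script I used; the names ImagePairParser and extract_pairs are made up for illustration, and it assumes each div with class "image" holds one p with the description and one img with the source, as in the snippet above.

```python
from html.parser import HTMLParser


class ImagePairParser(HTMLParser):
    """Collect (description, src) pairs from <div class="image"> blocks.

    Hypothetical helper: assumes one <p> and one <img> per image div.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []      # finished (description, src) tuples
        self._depth = 0      # div nesting depth inside the current image div
        self._in_p = False   # True while inside a <p> of an image div
        self._desc = []      # text fragments of the current description
        self._src = None     # src of the first <img> in the current div

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._depth == 0:
            # Only start collecting when an image div opens.
            if tag == "div" and "image" in (attrs.get("class") or "").split():
                self._depth = 1
                self._desc, self._src = [], None
            return
        if tag == "div":
            self._depth += 1
        elif tag == "p":
            self._in_p = True
        elif tag == "img" and self._src is None:
            self._src = attrs.get("src")

    def handle_endtag(self, tag):
        if self._depth == 0:
            return
        if tag == "p":
            self._in_p = False
        elif tag == "div":
            self._depth -= 1
            # The image div just closed: emit the pair if it had an image.
            if self._depth == 0 and self._src:
                self.pairs.append(("".join(self._desc).strip(), self._src))

    def handle_data(self, data):
        if self._depth and self._in_p:
            self._desc.append(data)


def extract_pairs(html):
    """Return a list of (description, src) tuples found in the HTML text."""
    parser = ImagePairParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


sample = """
<div class="image">
  <p>Sunset in Hawaii</p>
  <img src="../randomstring123.jpg">
</div>
"""

# Print one tab-separated "description<TAB>src" line per image.
for description, url in extract_pairs(sample):
    print(f"{description}\t{url}")
```

Printing one tab-separated line per image keeps the output easy to consume from a shell read loop that calls curl -o "$description.jpg" "$url". Note that a relative src like the one above would still have to be resolved against the page URL first, for example with urllib.parse.urljoin.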

