libcurl - unable to download a file

Posted by marmistrz on Stack Overflow
Published on 2012-07-05T17:03:29Z

I'm working on a program that downloads lyrics from sites like AZLyrics, using libcurl.
Here is my code:

lyricsDownloader.cpp

#include "lyricsDownloader.h"
#include <curl/curl.h>
#include <cstring>
#include <iostream>

#define DEBUG 1

/////////////////////////////////////////////////////////////////////////////


size_t lyricsDownloader::write_data_to_var(char *ptr, size_t size, size_t nmemb, void *userdata) // this function is a static member function
{
    ostringstream * stream = (ostringstream*) userdata;
    size_t count = size * nmemb;
    stream->write(ptr, count);
    return count;
}


string AZLyricsDownloader::toProviderCode() const
{ /* this builds the URL */ }

CURLcode AZLyricsDownloader::download()
{
    CURL * handle;
    CURLcode err;
    ostringstream buff;
    handle = curl_easy_init();
    if (! handle) return static_cast<CURLcode>(-1);
    // set verbose if debug on
    curl_easy_setopt( handle, CURLOPT_VERBOSE, DEBUG );
    curl_easy_setopt( handle, CURLOPT_URL, toProviderCode().c_str() ); // set the download url to the generated one
    curl_easy_setopt(handle, CURLOPT_WRITEDATA, &buff);
    curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, &AZLyricsDownloader::write_data_to_var);
    err = curl_easy_perform(handle); // the segfault seems to happen inside this call, after it starts but before it returns
    cerr << "cleanup\n";
    curl_easy_cleanup(handle);

    // copy the contents to text variable
    lyrics = buff.str();
    return err;
}
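
Both logs further down stop right after the response headers, which is the point where libcurl would first hand body bytes to the write callback, so the callback is a reasonable place to look. What follows is a minimal sketch of a callback with the signature libcurl documents for CURLOPT_WRITEFUNCTION, written as a free function that appends to a std::string. The name append_to_string is hypothetical and not part of the original code, and the sketch does not claim to identify the actual cause of the crash.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not from the original post. It has the shape libcurl
// documents for CURLOPT_WRITEFUNCTION: (char*, size_t, size_t, void*).
// userdata is assumed to be the std::string* given to CURLOPT_WRITEDATA.
std::size_t append_to_string(char *ptr, std::size_t size,
                             std::size_t nmemb, void *userdata)
{
    std::string *out = static_cast<std::string *>(userdata);
    const std::size_t count = size * nmemb;
    out->append(ptr, count); // the data is not NUL-terminated, so pass the length
    return count;            // returning a different count makes libcurl abort the transfer
}
```

It would be registered with curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, append_to_string) and curl_easy_setopt(handle, CURLOPT_WRITEDATA, &text), where text is a std::string that stays alive until curl_easy_perform returns. A free function avoids any doubt about whether the member function really is static.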

main.cpp

#include <QString>
#include <QTextEdit>
#include <iostream>
#include "lyricsDownloader.h"

int main(int argc, char *argv[])
{
        AZLyricsDownloader dl(argv[1], argv[2]);
        dl.perform();
        QTextEdit qtexted(QString::fromStdString(dl.lyrics));
        cout << qPrintable(qtexted.toPlainText());
        return 0;
}
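
The main function above reads argv[1] and argv[2] without looking at argc. In the run shown below both arguments are supplied, so this is probably not the crash being asked about, but starting the program with fewer arguments would read past the supplied ones. A minimal sketch of a guard, assuming a hypothetical helper named read_artist_and_title that is not in the original code:

```cpp
#include <string>

// Hypothetical helper, not from the original post: copies the artist and
// title out of argv only when both arguments are actually present.
bool read_artist_and_title(int argc, char *argv[],
                           std::string &artist, std::string &title)
{
    if (argc < 3 || argv[1] == nullptr || argv[2] == nullptr)
        return false; // the caller should print a usage message and exit
    artist = argv[1];
    title = argv[2];
    return true;
}
```

main could call this first and construct AZLyricsDownloader only when it returns true.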

When I run

./maelyrica Anthrax Madhouse

curl logs this:

* About to connect() to azlyrics.com port 80 (#0)
*   Trying 174.142.163.250... * connected
* Connected to azlyrics.com (174.142.163.250) port 80 (#0)
> GET /lyrics/anthrax/madhouse.html HTTP/1.1
Host: azlyrics.com
Accept: */*

< HTTP/1.1 301 Moved Permanently
< Server: nginx/1.0.12
< Date: Thu, 05 Jul 2012 16:59:21 GMT
< Content-Type: text/html
< Content-Length: 185
< Connection: keep-alive
< Location: http://www.azlyrics.com/lyrics/anthrax/madhouse.html
< 
Segmentation fault

Strangely, the page exists. The same error appears when the page does not exist (in that case the site redirects to the azlyrics.com main page).
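
The 301 response in the log does not mean the page is missing: its Location header points at the same path on www.azlyrics.com, while the request went to azlyrics.com. By default libcurl does not follow redirects and treats the 301 as the final response. One option is to generate the www URL directly. The sketch below assumes hypothetical helpers to_slug and build_azlyrics_url; it is not the original toProviderCode(). Lowercasing matches the path visible in the log, while dropping non-alphanumeric characters is an assumption about the site's URL scheme.

```cpp
#include <cctype>
#include <string>

// Hypothetical helpers, not the original toProviderCode(). Lowercasing matches
// the path seen in the log; dropping other characters is an assumption.
std::string to_slug(const std::string &text)
{
    std::string slug;
    for (unsigned char c : text)
        if (std::isalnum(c))
            slug += static_cast<char>(std::tolower(c));
    return slug;
}

std::string build_azlyrics_url(const std::string &artist, const std::string &title)
{
    // The www host is the one named in the Location header of the 301 response.
    return "http://www.azlyrics.com/lyrics/" + to_slug(artist) + "/" +
           to_slug(title) + ".html";
}
```

For the run above, build_azlyrics_url("Anthrax", "Madhouse") yields the URL from the Location header, so no redirect would be needed for that request.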

What am I doing wrong?

Thanks in advance

EDIT: I made the write function static, but that changed nothing. Even wget seems to have problems:

$ wget http://www.azlyrics.com/lyrics/anthrax/madhouse.html
--2012-07-06 10:36:05--  http://www.azlyrics.com/lyrics/anthrax/madhouse.html
Resolving www.azlyrics.com... 174.142.163.250
Connecting to www.azlyrics.com|174.142.163.250|:80... connected.
HTTP request sent, awaiting response... No data received.
Retrying.

Why does opening the page in a browser work, while wget and curl fail?

EDIT2: After adding this:

curl_easy_setopt(handle, CURLOPT_FOLLOWLOCATION, 1);
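
A side note on the literal: libcurl documents the argument of CURLOPT_FOLLOWLOCATION (and of CURLOPT_VERBOSE) as a long. Because curl_easy_setopt is declared as a variadic function, an int argument is not converted to long on the way in, so 1L is the safer spelling. On many platforms a plain 1 happens to work, and this is not necessarily related to the crash. The sketch below illustrates the mechanism with a hypothetical stand-in function, read_long_option, which is not libcurl code.

```cpp
#include <cstdarg>

// Hypothetical stand-in for a variadic setter; this is NOT libcurl code.
// Like curl_easy_setopt, it receives its value through "..." and reads it
// back as a long, so the caller has to pass a long.
long read_long_option(int option, ...)
{
    va_list args;
    va_start(args, option);
    const long value = va_arg(args, long); // reading an int here would be undefined behavior
    va_end(args);
    return value;
}
```

Calling read_long_option(0, 1L) is well defined; read_long_option(0, 1) is formally undefined, which is why the libcurl examples write 1L for options of this kind.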

The log is:

* About to connect() to azlyrics.com port 80 (#0)
*   Trying 174.142.163.250... * connected
* Connected to azlyrics.com (174.142.163.250) port 80 (#0)
> GET /lyrics/anthrax/madhouse.html HTTP/1.1
Host: azlyrics.com
Accept: */*

< HTTP/1.1 301 Moved Permanently
< Server: nginx/1.0.12
< Date: Fri, 06 Jul 2012 09:09:47 GMT
< Content-Type: text/html
< Content-Length: 185
< Connection: keep-alive
< Location: http://www.azlyrics.com/lyrics/anthrax/madhouse.html
< 
* Ignoring the response-body
* Connection #0 to host azlyrics.com left intact
* Issue another request to this URL: 'http://www.azlyrics.com/lyrics/anthrax/madhouse.html'
* About to connect() to www.azlyrics.com port 80 (#1)
*   Trying 174.142.163.250... * connected
* Connected to www.azlyrics.com (174.142.163.250) port 80 (#1)
> GET /lyrics/anthrax/madhouse.html HTTP/1.1
Host: www.azlyrics.com
Accept: */*

< HTTP/1.1 200 OK
< Server: nginx/1.0.12
< Date: Fri, 06 Jul 2012 09:09:47 GMT
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: keep-alive
< 
Segmentation fault

© Stack Overflow or respective owner
