Entries tagged download

Simple wget crawler for list of files

Posted on 6. August 2015 Comments

This script could be helpful to download a set of files from a webserver, that you don’t have (S)FTP access to. The input file consists of a list of filenames, one name each line.


#!/bin/bash
file=$1
WEBSERVER="http://webserver.tld/folder/"
while IFS= read -r line; do
FULLURL="$WEBSERVER$line"
wget -nc -R --spider $FULLURL
done < "$file"

At first, the first command line argument is saved into the variable file. Then the Webserver address is saved to the WEBSERVER variable. IFS stands for Internal Field Separator. It’s used to read line by line through the file in the while loop that ends in the last line. Inside of the loop, the read line is concatenated with the webserver address into FULLURL. Then, wget is used with the parameters -nc for checking if the file is not already present in the current folder, -R for downloading and –spider for checking the existence on the webserver.

You can find the script on GitHub.

Automate selling at LaRedoute #1: Get new orders

Posted on 16. Dezember 2014 Comments

This blog post is part of the series Automize selling at LaRedoute.


The french marketplace LaRedoute unfortunately doesn’t have a real API, but they do have ways to automize some processes. A lot of smaller marketplaces have this concept as well. You will get credentials for an SFTP server. On this server you will find the folders ToSupplier and FromSupplier, where the „supplier“ (aka you) can up- and download a range files documented by Merchantry in their blog. The processing of the uploaded files can take up to 6 hours, but is sometimes done in only a couple of minutes, so I’m going to assume the worst case of 6 hours in this post.

While programming a couple of scripts I found the following problems:

  • the server is incredibly slow sometimes (better at nights), so sometimes the connections just time out
  • sometimes the listing for the ToSupplier folder times out because there are too many files (according to support…huh?), so they have to be deleted regularly
  • not only the connection to LaRedoute but also the connection to my local MySQL server times out
  • I have to reserve a purchased item once I accepted it on LaRedoute immediately, because it could be sold elsewhere in the 6 hours LaRedoute might take to give me the shipping address

New orders can be found in the ToSupplier folder in tab seperated CSV files (but .txt ending) with the format OrdersYYYY-MM-DD-hh-mm-ss.txt.

Since PHP is the companies main language I will show a couple of scripts which automize downloading and processing those files. The code is of course simplified for better understanding. We’re using SFTP instead of FTP and I found using the phpseclib to be the most usable library.

I will propose the use of 2 Tables in the MySQL database: TEMP-FILENAMES and FILENAMES-HISTORY. Both have a the unique column filename. FILENAMES-HISTORY will contain the name of every file ever processed by the following script, TEMP-FILENAMES is a helper table that will be truncated after every run.

First, we need to establish a connection


$sftp = new Net_SFTP(SFTP_LAREDOUTE_HOST);
if (!$sftp-&gt;login(SFTP_LAREDOUTE_USER, SFTP_LAREDOUTE_PASS)) {
exit('Login Failed');
}

Then we change directory. This is a command that usually involves listing the directory changed into, but since this is not a graphical client, the real timeout might come on line below. $nlist will just be null if the listing fails, and I will assume it didn’t work if it takes more than 30 seconds.

$sftp->chdir('/ToSupplier');

$beforetime = time();
$nlist = $sftp->nlist();
$aftertime = time();
if(($aftertime-$beforetime) > 30 ) {
exit('Timeout while Listing directory');
}

The next piece of code is only executed if the listing worked. Every filename that includes the word „Order“ is now inserted into the temporary table:

foreach($nlist as $filename) {
if (strpos($filename, 'Order') !== false) {
$qry = "INSERT INTO `TEMP-FILENAMES`(`filename`) VALUES ('". $filename . "')";
$insert = mysql_query($qry,MYSQLCONNECTION) or print mysql_error();
}
}

You can look at the difference between the filenames in your HISTORY table and the possibly new ones in the temporary table.

$tmpCmpFilenames = array();
$qry = "SELECT `filename` FROM `TEMP-FILENAMES` WHERE `filename` NOT IN (SELECT `filename` FROM `FILENAMES-HISTORY`)";
$select = mysql_query($qry, MYSQLCONNECTION) or print mysql_error();
while ($row = mysql_fetch_assoc($select)) {
$tmpCmpFilenames[] = $row['filename-id'];
}

Now we have all the new files in the array $tmpCmpFilenames. The correct way would be make sure the downloaded files are correct with hashes. Instead we decided to misuse the filesize, since it’s a good indicator something didn’t work properly;) The files not downloaded correctly are deleted from the array. They will appear next time the script is run.

foreach($tmpCmpFilenames as $filename) {
$remotefilesize = $sftp->size($filename);
$sftp->get($filename, 'OrdersFromLaRedoute/' . $filename);
$localfilesize = filesize('OrdersFromLaRedoute/' . $filename);
if ($remotefilesize != $localfilesize) {
unset($tmpCmpFilenames[$filename]);
}
}

We can now insert the filenames into the HISTORY table.

foreach($tmpCmpFilenames as $filename) {
$qry = "INSERT INTO `FILENAMES-HISTORY`(`filename`) VALUES ('". $filename . "')";
$insert = mysql_query($qry, MYSQLCONNECTION) or print mysql_error();
}

Last but not least, the temporary table needs to be truncated for the next run.

$truncate=mysql_query("TRUNCATE TABLE `TEMP-FILENAMES`",MYSQLCONNECTION) or print mysql_error();

The next step is described in part 2 of this series.

Beste Podcast App fuer Android

Posted on 26. September 2011 Comments

Ich habe mir einige angeschaut und bin von Podkicker restlos begeistert. WĂŒsste nichts, was die hĂ€tten besser machen können.

qrcode

Features:

  • Verwaltung und HinzufĂŒgen von Podcast-Feed via URL oder Empfehlungen
  • Neue Podcasts werden in einer Liste/Tab angezeigt, unabhĂ€ngig vom Feed, auch automatische nach einer bestimmten Zeit
  • Integrierter Player, der sich merkt, wo man zuletzt war(-> kein nerviges Suchen mehr)
  • Download und von dort aus HinzufĂŒgen in die Playlist
  • Löschen oder aus der Playlist nehmen, wenn der Podcast fertig gehört ist
  • Abspeichern/Laden von abonnierten Podcasts im OPML-Standard
  • RSS und Atom

 

update: inzwischen habe ich mir wegen der flattr-UnterstĂŒtzung die Pro-Version gekauft. Mit Android 2.2 funktioniert das aber leider nicht wegen irgendwelchen Zertifikaten. Da werd ich nochmal nachforschen. DafĂŒr spielt der Podcast weiter, wenn man den Stecker rausgezogen hat und ihn dann wieder reinsteckt, sehr praktisch!

update2: da nach einer Weile die flattr UnterstĂŒtzung aufgrund API Änderungen auch bei Podkicker Pro unter Android 4.4 kaputt war, habe ich zu AntennaPod gewechselt, die inzwischen eine sehr brauchbare und schlichte OberflĂ€che erstellt haben. Es gibt außerdem ebenfalls eine flattr UnterstĂŒtzung und es ist Open Source.