# Download lists for dial up users

## fghellar

Edit (13/February/2003)

Important! Check the update below:

https://forums.gentoo.org/viewtopic.php?p=206547#206547

---------- Changelog ----------

(25/September/2002):

To make it work with Portage 2.0.37, check MoonWalker's patch.

(13/July/2002):

The new script is there now.

(12/July/2002):

Warning: this script does not work with newer versions of portage, due to changes made to the format and name of the dependency file. I'll hopefully have it fixed by tomorrow.

---------- End Changelog ----------

I wrote an awk script to generate a list of files to be downloaded based on the output of emerge -p. It is intended to help those who don't have a fast Internet connection download their files elsewhere.

Here is the script:

```
# initial setup

BEGIN      {

            # set default download mirror (without a trailing "/")

            gentoo_mirror = "http://www.ibiblio.org/gentoo"

            

            # initialize URL counter

            total_urls = 0

         }

# match only lines containing the word "ebuild"

/ebuild/   {

            # the original string has already been split using the default separator (blank space)

            # the 4th element consists of "category/package" and will be split further

            split($4, categ_pack, "/")

            # determine the base name of the package

            split(categ_pack[2], pack_base, /-[0-9]/)

            # form the "ebuild package depend" command

            ebuild_cmd = "ebuild " "/usr/portage/" categ_pack[1] "/" pack_base[1] "/" categ_pack[2] ".ebuild" " depend"

            # form the dependency file name

            dep_file_name = "/var/cache/edb/dep/" "dep-" categ_pack[2] ".ebuild"

            # run ebuild depend

            print ebuild_cmd | "sh"

            close("sh")

            # read the generated dependency file

            getline dep_file < dep_file_name

            # split its contents using "'" as the separator

            split(dep_file, dep_file_cont, "'")

            # the 8th field contains the URLs, separated by spaces

            num_urls = split(dep_file_cont[8], urls, " ")

            # process each URL in turn

            for (n = 1; n <= num_urls; n++) {

               # determine the name of the file to be downloaded

               # splitting the URL using "/", the file name is the last element

               file_name_index = split(urls[n], file_name, "/")

               # increment URL counter

               total_urls++

               # add mirror URL

               mirror_urls[total_urls] = gentoo_mirror "/distfiles/" file_name[file_name_index]

               # add download URL

               download_urls[total_urls] = urls[n]

            }

         }

# output URLs

END         {

            # mirror URLs go first

            for (n = 1; n <= total_urls; n++) {

               print mirror_urls[n]

            }

            # and then the others

            for (n = 1; n <= total_urls; n++) {

               # don't show URLs starting with "mirror://"

               if (index(download_urls[n], "mirror://") == 0)

                  print download_urls[n]

            }

         }
```
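For anyone who finds the awk hard to follow, the mirror-URL step can be sketched in plain shell. This is only an illustration: `to_mirror_url` is a made-up helper name and the sample URL is invented. The mirror base is the one set in the script.

```bash
# Sketch only: the "mirror URL" step of the awk script, redone in shell.
# to_mirror_url is a hypothetical helper name; the sample URL is invented.
gentoo_mirror="http://www.ibiblio.org/gentoo"

to_mirror_url() {
    # ${1##*/} strips everything up to the last "/", leaving the file name
    printf '%s/distfiles/%s\n' "$gentoo_mirror" "${1##*/}"
}

to_mirror_url "ftp://ftp.example.org/pub/foo/foo-1.0.tar.gz"
# prints http://www.ibiblio.org/gentoo/distfiles/foo-1.0.tar.gz
```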

And here is how to use it:

1. Save the above script as a text file named gen_dl_list.awk (or whatever pleases you).

2. Run emerge -p and process its output with the script. This can be done in several ways. Some of them are:

2.1

```
# emerge -p xfree | awk -f gen_dl_list.awk
```

2.2

```
# emerge -p xfree > tempfile

# awk -f gen_dl_list.awk tempfile
```

2.3

```
# emerge -p xfree | awk -f gen_dl_list.awk > download_list
```

2.4

```
# emerge -p xfree > tempfile

# awk -f gen_dl_list.awk tempfile > download_list
```

Any of the above will output a list of files to be downloaded. (2.1) and (2.2) will output the list to the screen (not very useful). (2.3) and (2.4) will save the list in the file download_list (recommended).

3. Take the generated list (download_list, or whatever you've called it) to any place where you have access to a fast Internet connection and get the files with wget:

```
# wget -c -i download_list
```

If you have problems with ftp sites, try this:

```
# wget --passive-ftp -c -i download_list
```

In case you need it, there's also wget for Windows.

4. Take the downloaded files back to your computer (via CD-Rs, Zip disks, hard disks, whatever) and put them in /usr/portage/distfiles. Now you can run

```
# emerge xfree
```

or

```
# time emerge xfree
```

without even needing to be connected to the Internet.
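Before carrying the files home, it may help to check that nothing from the list is missing. The following is a rough sketch, not part of Portage or wget: `missing_files` is a hypothetical helper that takes the list and a directory and prints the file names it cannot find there.

```bash
# Sketch only: list files named in a download list that are not yet in a directory.
# missing_files is a hypothetical helper; it is not part of Portage or wget.
missing_files() {
    list=$1
    dir=$2
    while read -r url; do
        [ -n "$url" ] || continue          # skip blank lines
        name=${url##*/}                    # file name is the last path element
        [ -e "$dir/$name" ] || echo "$name"
    done < "$list" | sort -u               # the list may name a file more than once
}
```

It could be run as `missing_files download_list .` in the directory where wget saved the files, or against /usr/portage/distfiles after copying. It only checks that a file exists, not that the download completed.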

Known problems/limitations:

1. This was my first experience with awk. I wrote this script using only Daniel Robbins' Awk by example articles and the GNU Awk User's Guide. This means it is far from perfect. Any good programmer out there can, and is encouraged to, make it much better.

2. The way it is implemented now, the generated list can (and does) contain several entries for the same file. Fortunately, wget is smart enough to download each file only once.

3. I don't know how to handle the new "mirror://" syntax. I just left it out.
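On limitation 2, one possible workaround is to filter repeated lines out of the list afterwards. A minimal sketch, assuming exact duplicate lines are the only problem (`dedupe_list` is a made-up name):

```bash
# Sketch only: drop repeated lines from a download list, keeping the original
# order (so the mirror URLs printed first stay first).
# dedupe_list is a hypothetical name.
dedupe_list() {
    # seen[$0]++ is 0 (false) the first time a line appears, so only
    # first occurrences are printed
    awk '!seen[$0]++' "$@"
}
```

For example, `dedupe_list download_list > download_list.uniq`. The same file reachable through two different URLs still appears twice, since only identical lines are removed.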

Well, I guess that's all for now... Anything constructive is welcome.  :Smile: 

----------

## zen_guerrilla

You saved my life  :Smile: . Seriously, this was something I really needed, since I fetch files at work. My addition is:

```

# cp gen_dl_list.awk /usr/local/bin

```

Save the following shell script as /usr/local/bin/get_list:

```

#!/bin/sh

if [ -e tempfile ]; then

   {   

   awk -f /usr/local/bin/gen_dl_list.awk tempfile > download_list

   rm -i tempfile

   echo "Now run : wget -c -i download_list"

   echo "or :      wget --passive-ftp -c -i download_list"

   echo "to fetch the files"

   }

else

   {

   echo "tempfile not found !"   

   echo "Please run : "

   echo "emerge -p something > tempfile"

   echo "before running this script"

   }

fi

```

```

# chmod +x /usr/local/bin/get_list

```

The paths are just a suggestion; they must be in root's PATH, though...

I hope this makes life even easier  :Smile: 

----------

## delta407

mirror://sourceforge allows Portage to download a package from the sourceforge.net mirrors... to translate them into valid URLs, you could pipe your output stream to something like this:

```
sed -e "s|mirror://sourceforge|http://$SFMIRROR.dl.sourceforge.net/sourceforge|"
```

...where $SFMIRROR is exported as one of the valid sourceforge.net download mirrors, such as "unc" (ibiblio), "telia", or "belnet".
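A small sketch of this translation wrapped up as a reusable filter. `sf_translate` is a made-up name, the sample URL is invented, and "unc" is just one of the mirror names listed above.

```bash
# Sketch only: rewrite mirror://sourceforge URLs for a chosen mirror.
# sf_translate is a hypothetical helper name.
SFMIRROR=${SFMIRROR:-unc}

sf_translate() {
    # double quotes let the shell expand $SFMIRROR;
    # using "|" as the sed delimiter avoids escaping every "/"
    sed -e "s|mirror://sourceforge|http://$SFMIRROR.dl.sourceforge.net/sourceforge|"
}

echo "mirror://sourceforge/foo/foo-1.0.tar.gz" | sf_translate
```

With SFMIRROR unset or set to "unc", this prints `http://unc.dl.sourceforge.net/sourceforge/foo/foo-1.0.tar.gz`. Lines that do not contain mirror://sourceforge pass through unchanged.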

----------

## fghellar

New version, totally rewritten (now in bash), much better and much easier:

EDIT: The script was updated on 14/July/2002. If you got it before this date, replace it with this one.

```
#!/bin/bash

# set your defaults here:

user_defs() {

   

   # portage directory (without a trailing "/"):

   portage_dir="/usr/portage"

   

   # default download mirror (without a trailing "/"):

   gentoo_mirror="http://www.ibiblio.org/pub/Linux/distributions/gentoo"

   

   # default sourceforge mirror (unc, telia, belnet):

   sourceforge_mirror="unc"

   

}

#------------------------------------------------------------

# function to remove temporary files

cleanup() {

   

   rm -f $temp_file_1 $temp_file_2

   exit $1

   

}

# set user defaults

user_defs

# set the complete url for the sourceforge mirror

# (the \'s are needed because this goes in a sed command)

sourceforge_mirror_complete="http:\/\/$sourceforge_mirror.dl.sourceforge.net\/sourceforge"

# initialize counters

num_files=0

num_alt_urls=0

total_size=0

# initialize lists (arrays)

declare -a def_urls_arr

declare -a alt_urls_arr

# create 2 temporary files

temp_file_1=`mktemp -t dl-list.XXXXXX` || cleanup 1

temp_file_2=`mktemp -t dl-list.XXXXXX` || cleanup 1

# run "emerge -p <args>" (too easy to forget the "-p" in the command line...)

emerge -p $@ > $temp_file_1 || cleanup 1

# remove the lines that do not contain the word "ebuild"

sed -n -e '/ebuild/p' $temp_file_1 > $temp_file_2

# count how many lines were left

num_ebuilds=`wc -l $temp_file_2 | sed -e 's/\(.*\) \(.*\)/\1/'`

# extract the useful information from those lines: category, package and version

sed -e 's:\(.*\) \(.*\)/\(.*\)-\([0-9].*\) \(.*\) \(.*\):\2 \3 \4:' $temp_file_2 > $temp_file_1

# display starting message :)

echo -n "Generating list " >&2

# process each package in turn

while read category package version

do

   

   # form the name of the digest file

   digest_file="$portage_dir/$category/$package/files/digest-$package-$version"

   

   # process the contents of the digest file

   while read md5_flag md5_sum file_name file_size

   do

      

      # form the default url to download the file

      def_urls_arr[$num_files]="$gentoo_mirror/distfiles/$file_name"

      

      # increment the file counter

      num_files=$(($num_files + 1))

      

      # update the size accumulator (in kilobytes)

      total_size=$(($total_size + $file_size / 1024))

      

   done < $digest_file

   

   # form the "ebuild depend" command line

   ebuild_depend_cmd="ebuild $portage_dir/$category/$package/$package-${version}.ebuild depend"

   

   # execute the "ebuild depend" command

   $ebuild_depend_cmd || cleanup 1

   

   # form the name of the dependency file

   dependency_file="/var/cache/edb/dep/$category/$package-$version"

   

   # read in the 4th line from the dependency file,

   # which contains the official download urls

   alt_urls=`head -n 4 $dependency_file | tail -n 1`

   

   # ignore empty url list

   if [ -n "$alt_urls" ]

   then

      

      # split the urls list into $1..$N

      set $alt_urls

      

      # process each url in turn

      for i in $@

      do

         

         # remove the <use>? strings from the url list

         alt_url_tmp=`echo "$i" | sed -e '/\?$/d'`

         

         # remove the "mirror://gnome" urls

         alt_url_tmp=`echo "$alt_url_tmp" | sed -e '/^mirror:\/\/gnome/d'`

         

         # remove the "mirror://kde" urls

         alt_url_tmp=`echo "$alt_url_tmp" | sed -e '/^mirror:\/\/kde/d'`

         

         # remove the "mirror://gentoo" urls (already included)

         alt_url_tmp=`echo "$alt_url_tmp" | sed -e '/^mirror:\/\/gentoo/d'`

         

         # translate the "mirror://sourceforge" urls into valid urls

         alt_url_tmp=`echo "$alt_url_tmp" | sed -e "s/mirror:\/\/sourceforge/$sourceforge_mirror_complete/"`

         

         # ignore empty urls

         if [ -n "$alt_url_tmp" ]

         then

            

            # add the url to the list

            alt_urls_arr[$num_alt_urls]=$alt_url_tmp

            

            # increment the alternate url counter

            num_alt_urls=$(($num_alt_urls + 1))

            

         fi

         

      done

      

   fi

   

   # a progress bar :)

   echo -n "." >&2

   

done < $temp_file_1

# display ending message :)

echo " done." >&2

# display default urls list

for i in ${def_urls_arr[@]}; do echo $i; done | sort

# display alternate urls list

for i in ${alt_urls_arr[@]}; do echo $i; done | sort

# display totals

echo "Totals:" $num_ebuilds "ebuilds," $num_files "files," $num_files "default urls," $num_alt_urls "alternate urls," "${total_size}Kb." >&2

# remove temporary files and exit

cleanup 0
```

And the instructions:

1. Save the script as e.g. /usr/sbin/dl-list

2. Make it executable:

```
# chmod +x /usr/sbin/dl-list
```

3. Use it!

```
# emerge -p kde

# dl-list kde

# dl-list kde > kde.dl

# wget --passive-ftp -c -i kde.dl
```

```
# emerge -p -u kde

# dl-list -u kde

# dl-list -u kde > kde.dl

# wget --passive-ftp -c -i kde.dl
```

```
# emerge -p -u kde

# dl-list -p -u kde

# dl-list -p -u kde > kde.dl

# wget --passive-ftp -c -i kde.dl
```

Comments:

1. The script now runs emerge internally, so you don't need to run emerge -p before. In fact, you can run it exactly as you would run emerge. You can even specify several packages or specific versions for packages. The script adds -p automatically, in case you forget.  :Wink: 

2. mirror://sourceforge is now handled correctly, but mirror://gnome is just left out...

2.1 Update: mirror://kde is also left out.

3. The script now displays a progress bar (sort of) and some statistics.  :Wink: 

4. The rest is basically the same.
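For the curious, the size statistic comes from the digest files. The sketch below isolates that step, assuming each digest line reads "MD5, checksum, file name, size in bytes", which is how the script reads it. `sum_digest_kb` is a made-up name.

```bash
# Sketch only: add up the sizes listed in a digest file, in kilobytes.
# Assumes each line reads "MD5 <checksum> <file name> <size in bytes>".
# sum_digest_kb is a hypothetical helper name.
sum_digest_kb() {
    total_kb=0
    while read -r md5_flag md5_sum file_name file_size; do
        [ -n "$file_size" ] || continue            # skip blank or short lines
        total_kb=$((total_kb + file_size / 1024))  # integer division, per file
    done < "$1"
    echo "$total_kb"
}
```

Because each file's size is divided by 1024 and truncated before being added, the total slightly underestimates the real download size, just as in the script.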

I'd like to thank everyone for their input, and, as before, anything constructive is welcome.  :Smile: 

----------

## delta407

 *fghellar wrote:*   

> 2. mirror://sourceforge is now handled correctly, but mirror://gnome is just left out...

 

As is mirror://kde...

----------

## fghellar

 *delta407 wrote:*   

> As is mirror://kde...

 

It actually wasn't left out, nor handled correctly. I updated the script to leave it out too...

----------

## Pavan

Thanks a lot. It's amazing!!!   :Very Happy: 

-Pavan

----------

## Buster

That's great!

Thx, a lot!  :Very Happy: 

----------

## MoonWalker

Not sure if anyone still "watches" this thread, but... I tried this script and got the following result:

```
#dl-list kde

Generating list /usr/sbin/dl-list: $digest_file: ambiguous redirect

!!! doebuild: /usr/portage/[ebuild/N/N-] not found.
```

I'm really bad at sh, but it looks to me as if something is off in how it extracts the info returned by "emerge -p kde".

I run this on 1.4rc1, which may be the problem, as it seems to have been working before. Does anyone have a clue?

Joakim

----------

## pjp

I just installed the script and ran 'dl-list kde' and it generated a list.  What version of portage are you using?

----------

## MoonWalker

 *Quote:*   

> I just installed the script and ran 'dl-list kde' and it generated a list. What version of portage are you using?
> 
> 

 

I run Portage 2.0.37, the latest as far as I understand. My system is a P4, originally 1.3b, upgraded to 1.4rc1 with the official upgrade scripts (steps 1 to 4), and after this I ran

```
# emerge -e world
```

without any errors, so I can't really figure out what is going wrong here. I have no experience with bash shell scripts, so I'm not sure where it goes wrong, but some simple debugging attempts show it seems to be the part

```
# extract the useful information from those lines: category, package and version 

sed -e 's:\(.*\) \(.*\)/\(.*\)-\([0-9].*\) \(.*\) \(.*\):\2 \3 \4:' $temp_file_2 > $temp_file_1 

```

not doing the job properly, but I'm not sure about this.

EDIT:

Not sure if this matters, but after the upgrade to 1.4 there was actually one problem: my keymap had changed from "se-latin1" to "us", so I changed it back in rc.conf to get the keyboard working as before. Maybe I need to do more in this regard?

Joakim

----------

## zen_guerrilla

The script doesn't work on portage 2.0.37. It worked for me on 1.4rc1 with previous versions of portage. So I guess it needs an upgrade  :Smile: 

.:: zen ::.

----------

## MoonWalker

 *zen_guerrilla wrote:*   

> The script doesn't work on portage 2.0.37. It worked for me on 1.4rc1 with previous versions of portage. So I guess it needs an upgrade  

 

Ok thanks, did you "hear" that fghellar? Will there be an upgrade within reasonable time or someone else capable of doing it? I guess this is pretty simple stuff, if you know how to write regexp and shell scripts... 

joakim

----------

## pjp

I'll send fghellar a PM.  He's been busy lately, so he may not see it.  May not have time to update it either.  Honestly though, Portage 2.0.37 is buggy right now.  I wouldn't expect any updates until some new features are working 100%.

----------

## MoonWalker

I went on and stuck my nose deeper into this, reading some docs, and figured out that the difference with Portage 2.0.37 is that it doesn't add "to /" (or something like it) to the end of each row. Despite my absolute lack of knowledge of regular expressions, I managed to figure out which part represented this in

```
# extract the useful information from those lines: category, package and version 

sed -e 's:\(.*\) \(.*\)/\(.*\)-\([0-9].*\) \(.*\) \(.*\):\2 \3 \4:' $temp_file_2 > $temp_file_1 
```

so making it

```
# extract the useful information from those lines: category, package and version 

sed -e 's:\(.*\) \(.*\)/\(.*\)-\([0-9].*\):\2 \3 \4:' $temp_file_2 > $temp_file_1 
```

made it generate the list. So it was a quite simple fix as I assumed, even so simple I could manage it myself! Isn't life fantastic  :Smile: 
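To see what the patched expression does on its own, here is a small sketch that feeds it one line. `parse_ebuild_line` is a made-up name, and the sample line is only an assumed shape of the "emerge -p" output, not copied from a real run.

```bash
# Sketch only: pull "category package version" out of one "emerge -p" line,
# using the patched expression. parse_ebuild_line is a hypothetical name and
# the sample line is an assumed shape of the output.
parse_ebuild_line() {
    # \2 = category (before the "/"), \3 = package name,
    # \4 = version (starts at the first "-<digit>")
    sed -e 's:\(.*\) \(.*\)/\(.*\)-\([0-9].*\):\2 \3 \4:'
}

echo "[ebuild  N   ] kde-base/kdelibs-3.0.4" | parse_ebuild_line
# prints: kde-base kdelibs 3.0.4
```

Anything that follows the version on the line ends up inside the version field, because the last group matches to the end of the line.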

----------

## fghellar

 *MoonWalker wrote:*   

> So it was a quite simple fix as I assumed, even so simple I could manage it myself!

 

Thanks for patching it!  :Wink: 

I'll work on a new, improved version as soon as I get the time...

----------

## pompe

Hello!

I tried the patched script with the latest Portage (2.40? I don't remember), but with some packages, like kde and gnome, it doesn't work.

The message I got is something like this: '!!! doebuild: /usr/portage/ *the address of the ebuild*'. I tried some debugging, but I don't understand sed at all. But I think it is the packages where ebuild adds some extra information after the package's name, like this: '[0.2.43]'. That text is also saved in the $temp_file_2 text file just before the loop:

```
# process each package in turn

while read category package version

do
```

If some smart person knows how to fix the script, please do so.

Bye /Pontelonten

----------

## fghellar

 *pompe wrote:*   

> But I think it is the packages where ebuild adds some extra information after the package's name, like this: '[0.2.43]'. That text is also saved in the $temp_file_2 text file just before the loop:
> 
> ```
> # process each package in turn
> while read category package version
> ```
> ...

 

Thanks for your feedback.

Here is a quick hack for it:

Change this:

```
# process each package in turn

while read category package version

do
```

Into this:

```
# process each package in turn

while read category package version rest

do
```

This should make it work for now.
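A quick way to see why the extra variable helps: `read` puts everything left over on the line into its last variable. The sketch below uses an assumed sample line with a bracketed version after the package version.

```bash
# Sketch only: show how a trailing variable soaks up extra fields in "read".
# The sample line is an assumed shape, not copied from a real run.
line="kde-base kdelibs 3.0.4 [3.0.3]"

# Without "rest", the last variable receives everything left on the line
echo "$line" | { read -r category package version; echo "version='$version'"; }
# prints: version='3.0.4 [3.0.3]'

# With "rest", version holds just the third field
echo "$line" | { read -r category package version rest; echo "version='$version'"; }
# prints: version='3.0.4'
```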

There's another issue, in that it may output some garbage along with the urls. I'm short of time right now to debug it, but I hope I'll be able to look at it this weekend.

----------

## fghellar

Ok, folks, here is some good news and some bad news for you!

Bad news #1: This download-list-generator script will no longer be supported, and will probably also be removed from here in the future.

Good news #1: Portage has now (since version 2.0.40) incorporated this feature, via emerge -p -f <package>.

Bad news #2: The output of emerge -p -f <package> may contain some garbage URLs sometimes.

Good news #2: This is easy to fix!

Here's what you need to do: after running emerge -p <package(s)> and making sure that that's what you want, run

```
emerge -p -f <package(s)> | xargs -n1 echo | grep '://'
```

In case you don't have xargs available, or if for any other reason you don't want to use it, this also works:

```
for i in `emerge -p -f <package(s)>`; do echo $i; done | grep '://'
```

To save the list to a file, just do

```
emerge -p -f <package(s)> | xargs -n1 echo | grep '://' > download_list
```

or

```
for i in `emerge -p -f <package(s)>`; do echo $i; done | grep '://' > download_list
```
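To make it clearer what the filter part does, here it is applied to canned text instead of real emerge output. `extract_urls` is a made-up name, and the sample text and URLs are invented.

```bash
# Sketch only: the same filter applied to canned text.
# extract_urls is a hypothetical name; the sample text and URLs are invented.
extract_urls() {
    # xargs -n1 echo prints one whitespace-separated word per line,
    # and grep keeps only the words that contain "://"
    xargs -n1 echo | grep '://'
}

printf '%s\n' "These are the packages that I would fetch:" \
    "http://a.example/foo-1.0.tar.gz ftp://b.example/foo-1.0.tar.gz" | extract_urls
# prints the two URLs, one per line
```

Note that xargs gives quotes and backslashes a special meaning, so text containing them may not pass through cleanly.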

Unfortunately, this list can contain some garbage URLs, due to the fact that the use() function (defined in /usr/sbin/ebuild.sh and used in the ebuilds) generates output to the screen. Fortunately, it's very easy to "fix" this without causing any harm. Here's how:

Open the file /usr/sbin/ebuild.sh and scroll down until you find the definition of the use() function. It should look like this:

```
use() {

   local x

   for x in ${USE}

   do

      if [ "${x}" = "${1}" ]

      then

         echo "${x}"

         return 0

      fi

   done

   return 1

}
```

To "fix" it, just comment out that 'echo "${x}"' line, making it look like this:

```
use() {

   local x

   for x in ${USE}

   do

      if [ "${x}" = "${1}" ]

      then

         # echo "${x}"

         return 0

      fi

   done

   return 1

}
```

It should now work as expected. (Note: you'll have to edit this file again whenever you update/reinstall portage.)

----------

## peterg

knocked up a simple script to download distfiles required for a package, then realised someone else already made one  :Smile: 

might be of some use to someone

basic usage is 

```
 dl-list <package> 
```

eg dl-list kde

modify the SERVER variable to the mirror you wish to use, and DIR to your distfiles directory

bash script is as follows

```

#!/bin/bash

SERVER="ftp://mirror.internode.on.net/pub/gentoo/distfiles"

DIR="/usr/portage/distfiles"

FILELIST=`emerge -fpv $1 | cut -d " " -f 1 |grep http`

cd $DIR

for FILE in $FILELIST; do

        # keep only the file name (everything after the last "/")
        FILE=${FILE##*/}

        wget -N -P $DIR $SERVER/$FILE

done

```

after that you can obviously just run emerge <package> to compile the package

cheers

Pete

----------

## Rem

Hi,

I'm not sure if I'm doing anything wrong, but it seems that as of Portage 2.0.49 my list is no longer written to a file.

With the command:

```
emerge -p -f <package(s)> | xargs -n1 echo | grep '://' > download_list
```

I usually (with portage-2.0.48-r5) got the list of files to download in the file download_list, but with the newer Portage I get this output written to my screen, and it generates an empty download_list file.

Has anybody had the same experience? I've been trying this for quite some time now; maybe I'm overlooking something.

Rem

----------

## Sudrien

small modification passed on to me a little while ago... slightly different pipe.

```
emerge -p -f <package(s)> | xargs -n1 echo | grep '://' 2> download_list
```

Note the 2>

-Sud.

----------

## Rem

Doesn't seem to work here. What does the 2 do? I thought > was the sign for saving the output to a file. Which version of portage do you run? I run portage-2.0.49-r15.

Rem

----------

## fghellar

It looks like emerge -p -f is now printing the URLs to stderr instead of stdout. You need to redirect stderr to stdout, otherwise the URLs won't get passed through the pipe to xargs.

So, what you need is

```
emerge -p -f <package(s)> 2>&1 | xargs -n1 echo | grep '://' > download_list
```
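On the question of what the "2" does: 1 is stdout and 2 is stderr, and a pipe only carries stdout. The sketch below shows the effect with a stand-in function, since the real emerge is not needed for the demonstration. `fake_emerge` is made up and only mimics the behaviour described above.

```bash
# Sketch only: fake_emerge is a stand-in that prints a URL to stderr,
# mimicking the behaviour described above. It is not a real command.
fake_emerge() {
    echo "http://a.example/foo-1.0.tar.gz" >&2
}

# Without 2>&1 the URL goes straight to the screen and the pipe sees nothing.
# grep -c prints 0 (and exits non-zero when nothing matches, hence "|| true").
fake_emerge | grep -c '://' || true

# With 2>&1 stderr joins stdout, so the URL reaches grep, which prints 1.
fake_emerge 2>&1 | grep -c '://'
```

This also explains why putting `2>` at the end of the pipeline does not help: there it redirects the stderr of grep, not of emerge.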

----------

## Rem

Works like a charm, thanks!

Rem

----------

## MADcow

fghellar: how can i get a COMPLETE list -- including dependencies?

i'm trying emerge -p -f.

using -D doesn't give me a complete list... but i thought it might...

i have portage-2.0.50-r1

thanks!

----------

## fghellar

 *MADcow wrote:*   

> fghellar: how can i get a COMPLETE list -- including dependencies?
> 
> i'm trying emerge -p -f.

 

You can try with emerge -p -f -e.

[]'s

----------

## MADcow

thankee

----------

