# HOWTO:Download Cache for your LAN-Http-Replicator (ver 3.0)

## flybynite

HTTP-REPLICATOR IS NOW IN PORTAGE!!

http://packages.gentoo.org/ebuilds/?http-replicator-3.0

This thread is long, but please read the howto on the first page and then check the last page or two for updates...

Have A Couple (or Couple Hundred) Boxes On A LAN?

Be a good Gentoo Netizen and speed up your updates with Http-Replicator!  Http-Replicator is a proxy that works with Portage to serve packages from a cache.  It saves you bandwidth and helps gentoo grow by reducing load on the mirrors.  Cached download speed is limited only by your disk and LAN speeds!

Here is how it works.  Http-Replicator runs on one of your machines and listens for connections from your other gentoo boxes.  If the requested package is in the cache, it is sent out at LAN speed!  If it is not in the cache, http-replicator downloads the file once while simultaneously streaming it to every client that asked for it.  No matter how many machines request the package, only one copy comes down the internet pipe; multiple copies can stream out the LAN pipe.

This is the easiest, and most reliable way to share both source and binary packages.

Http-Replicator has been designed with speed and security in mind.  Give it a try!!

1. Emerge http-replicator.

NOTE: If upgrading a copy of http-replicator installed from a portage overlay, remove the overlay files and unmerge http-replicator first.  Don't worry, your config files will not be touched and will work fine with the new official ebuild...

```

# emerge http-replicator

```

2.  Modify /etc/make.conf on both the server and your other gentoo boxes.

Add an "http_proxy" line to /etc/make.conf:

```

http_proxy="http://YourProxyHere.com:8080" 

```

replacing YourProxyHere.com with the hostname or IP address of the box running http-replicator.

NOTE: Http-Replicator 3.0 no longer needs any special RESUMECOMMAND!!  Please comment out (place # at the start of the line) the previous RESUMECOMMAND changes if upgrading from a version prior to 3.0!!

3.  Check the config file /etc/conf.d/http-replicator, then run repcacheman to create the cache dir and transfer existing files into http-replicator's cache.

Most people can just run 'repcacheman', which will create the default cache dir /var/cache/http-replicator and complete the setup.  You can change the defaults if you have special needs.  repcacheman sets up the cache dir and user according to the /etc/conf.d/http-replicator config file and doesn't need any command-line options.

```

/usr/bin/repcacheman

```

NOTE: I created repcacheman to automate the install and maintenance of Http-Replicator's cache dir.  If repcacheman doesn't work for you for some reason, send me the bug reports.  Http-replicator will work without repcacheman, but you'll have to create the cache dir and chown it to complete the install.  repcacheman also checksums your existing /usr/portage/distfiles before moving them to the cache dir.  Portage leaves incomplete and corrupt files in the distfiles directory, and repcacheman will not move those files to the cache.
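If repcacheman fails for you, the manual equivalent is only two commands. This is a sketch assuming the ebuild defaults of cache dir /var/cache/http-replicator and the portage user; verify both against /etc/conf.d/http-replicator on your system before running it:

```shell
# Manual fallback for repcacheman (dir and owner are the assumed defaults
# from /etc/conf.d/http-replicator). Run as root.
mkdir -p /var/cache/http-replicator
chown portage:portage /var/cache/http-replicator
```

Unlike repcacheman, this does no checksumming, so only copy files from distfiles into the cache if you are confident they are complete.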

4. Next, start Http-Replicator on the server:

```

/etc/init.d/http-replicator start

```

5.  You should add http-replicator to your default runlevel

```

rc-update add http-replicator default

```

Don't forget that portage needs mirrors!  Edit GENTOO_MIRRORS in /etc/make.conf to add more http mirrors and place any ftp mirrors LAST.  The default mirrors in gentoo leave something to be desired  :Smile:   Use mirrorselect if you need help in selecting mirrors.  
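The ordering rule above can be sketched in a few lines of shell (the mirror URLs here are made up; the point is only that http entries stay in front of any ftp entries):

```shell
# Hypothetical mirror list -- push any ftp:// mirrors to the end so portage
# tries http mirrors (which http-replicator can cache) first.
mirrors="http://a.example/gentoo ftp://b.example/gentoo http://c.example/gentoo"

http_list=""
ftp_list=""
for m in $mirrors; do
  case "$m" in
    ftp://*) ftp_list="$ftp_list $m" ;;   # collect ftp mirrors separately
    *)       http_list="$http_list $m" ;; # http mirrors stay in front
  esac
done

# Unquoted expansion squeezes the extra spaces back out.
ordered=$(echo $http_list $ftp_list)
echo "GENTOO_MIRRORS=\"$ordered\""
```

The printed line is the form you would paste into /etc/make.conf.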

Also, some packages in portage have a RESTRICT="nomirror" option which will prevent portage from checking replicator for those packages.  The following will override this behavior.  Create the file "/etc/portage/mirrors" containing:

```

# Http-Replicator override for FTP and RESTRICT="nomirror" packages

local http://gentoo.osuosl.org

```

You can replace gentoo.osuosl.org with your favorite http mirror.  If you already have a local setting, don't worry; as long as it is an http mirror this will still be effective.

Then update something and watch http-replicator go!!  It doesn't matter which box or how many boxes you update at the same time!!  Http-Replicator can handle it!!

I recommend running repcacheman after running emerges on the server (except emerge sync).  It will delete duplicates after the server box fetches any files and also import FTP'd or other files into the cache.  The easiest way to do this is to run emerges on the server like this:

```

emerge -uDva world && repcacheman

```

This runs the emerge, then repcacheman once the emerge is complete.

To keep repcacheman fast and efficient, you should consider deleting any files that remain in your distfiles directory after repcacheman runs.  They are either:

1.  No longer in Portage

2. Incomplete or corrupt

3.  Just plain junk
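A minimal cleanup sketch, assuming the default Gentoo DISTDIR (review the listing before deleting anything):

```shell
# Files still sitting in DISTDIR after repcacheman has run are either no
# longer in portage, incomplete, or junk -- list them first, then remove
# interactively so nothing is deleted unseen.
ls -l /usr/portage/distfiles
rm -i /usr/portage/distfiles/*
```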

Http-Replicator will serve binary packages to clients!!  Build once and use binary packages for the rest of your gentoo herd!!  Great for mass installs!!  No other http or ftp daemons required!!

This needs a whole HOWTO but here is the quick version..

Add -b to your emerges on the server, or set FEATURES="buildpkg" in /etc/make.conf.  This will make a binary package after compiling.  Then just set each client's PORTAGE_BINHOST (in /etc/make.conf) to point to http://YourProxyHere:8080/All for a default gentoo install.
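Put together, the make.conf pieces might look like this (YourProxyHere is the same placeholder used earlier in this howto, and the /All path assumes the default binary-package layout):

```shell
# Server /etc/make.conf -- build a .tbz2 binary package with every emerge:
FEATURES="buildpkg"

# Client /etc/make.conf -- pull prebuilt packages through the replicator:
PORTAGE_BINHOST="http://YourProxyHere:8080/All"
```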

Example: a new install of xmms

```

emerge -vab xmms

```

Then run an emerge with the binary option -g on a client (see man portage for more options)

```

emerge -uvag xmms

```

-g will cause portage to use binary packages from the server if available, and compile if not on the server.  Huge time saver for similar machines!!!

You don't have to re-emerge packages to make binaries.

```

quickpkg xmms

```

will create a binary package for the already installed xmms.  quickpkg uses the installed config files for the package, so any custom configs will be part of the new package!!

Version 1.9

No longer masked in portage

Version 1.8

Now an official package!!

http://packages.gentoo.org/ebuilds/?http-replicator-3.0

Version 1.7

http-replicator 3.0!!

Deleted reference to old versions 

Added BINHOST info

Version 1.6c

Added note to delete old ebuilds for upgrades

Version 1.6b

Updated to any http://mirror and clarify conf

Version 1.6a

Updated http://gentoo.osuosl.org/ link

Version 1.6

Added repcacheman example

Some changes for 2.1

Version 1.5

Clarify repcacheman note

Version 1.4

Added repcacheman note

added /etc/portage/mirrors note

Version 1.3

Added repcacheman

Changed default cache dir

Version 1.2

Updated RESUMECOMMAND

Version 1.1

Added group permissions on /usr/portage/distfiles

Added mirror reminder

Simplify activation

Version 1.0

Last edited by flybynite on Sat Jan 28, 2006 3:21 am; edited 37 times in total

----------

## Darkaxe

Hi flybynite,

This ebuild should definitely be in the official portage tree.

  It works a treat, is easy to setup and fixes a complicated (or people have made it complicated) problem. I think this is what most people are after instead of the other 5 or 6 solutions I've found on the forums.   

Just a couple of questions for you tho...

1- Scenario: I have 2 workstations and a little server. The Http-Replicator server is running on my beefy workstation, where most of the packages are downloaded. If I emerge a package on the server that the workstation doesn't have, does that mean that when I go to emerge the same package on the workstation I have to download it again? Or will the package be stored on the workstation too?

2- Is there a way to automatically bypass Http-Replicator when the source for the emerge is on an ftp server? Currently the emerge will fail if the only sources for the ebuild are on ftp servers (e.g. kdebase-3.2.2.tar.bz2, kdegraphics-3.2.2.tar.bz2; the rest of the kde stuff is OK). It comes up with an error that the proxy port is invalid...

   Thanks   :Smile: 

----------

## flybynite

 *Darkaxe wrote:*   

> Hi flybynite,
> 
>    This ebuild should definitely be in the official portage tree.
> 
>   It works a treat, is easy to setup and fixes a complicated (or people have made it complicated) problem. I think this is what most people are after instead of the other 5 or 6 solutions I've found on the forums.   
> ...

 

Submitted: https://bugs.gentoo.org/show_bug.cgi?id=50872  It's just that the gentoo devs seem overloaded lately  :Smile: 

 *Darkaxe wrote:*   

> 
> 
> Just a couple of questions for you tho...
> 
> 1- Scenario: I have 2 workstations and a little server. The Http-Replicator server is running on my beefy workstation where most of the packages are downloaded to. If I emerge a package onto the server and the workstation didn't have it, does that mean if I go to emerge the same package onto the workstation I have to download it again? Or will the package be stored on the workstation too?
> ...

 

No package is ever downloaded twice, no matter who starts the download or when it is started  :Smile:   Start the same emerge on all 3 boxes at once and only 1 copy is downloaded, while all 3 boxes receive the file from http-replicator!!  (One exception: ftp mirrors aren't supported by http-replicator  :Sad:  but there are plenty of good http mirrors!)

 *Darkaxe wrote:*   

> 
> 
> 2- Is there a way to automatically bypass the Http-Replicator if the source for the emerge is on an ftp server? Currently the emerge will fail if the only sources for the ebuild are on ftp servers (ie. kdebase-3.2.2.tar.bz2, kdegraphics-3.2.2.tar.bz2. The rest of the kde stuff is OK.) It comes up with an error about the proxy port is invalid...
> 
>    Thanks  

 

This happens automatically in my setup directions.  You must have an ftp_proxy set somewhere in your environment, left over from another program; check for this.  If you did try to set http-replicator as an ftp_proxy, the error would be different!  You would get an ERROR -1: No data received, not proxy port invalid.

In http-replicator, we only set an http_proxy, not an ftp_proxy, so wget should not use the proxy for ftp transfers. 

Make sure you have plenty of good http mirrors in GENTOO_MIRRORS in /etc/make.conf; not really for http-replicator, just for gentoo in general.  Use mirrorselect if you need help selecting more mirrors.  The default mirrors in gentoo aren't the best.  I got kdebase etc. from gentoo.oregonstate.edu when the ebuilds first came out!

----------

## ja

Great ebuild  :Smile: 

 *Darkaxe wrote:*   

> 
> 
> 2- Is there a way to automatically bypass the Http-Replicator if the source for the emerge is on an ftp server? Currently the emerge will fail if the only sources for the ebuild are on ftp servers (ie. kdebase-3.2.2.tar.bz2, kdegraphics-3.2.2.tar.bz2. The rest of the kde stuff is OK.) It comes up with an error about the proxy port is invalid...
> 
>    Thanks  

 

Yes some special ebuilds depend on ftp servers.

Replace:

```
# Default fetch command (5 tries, passive ftp for firewall compatibility)

 PROXY="http_proxy=http://YourMirrorHere.com:8080"

 FETCHCOMMAND="$PROXY /usr/bin/wget -t 5  \${URI} -P \${DISTDIR}"

 RESUMECOMMAND="$PROXY /usr/bin/wget -t 5  \${URI} -P \${DISTDIR}" 
```

with: 

```
# Default fetch command (5 tries, passive ftp for firewall compatibility)

http_proxy="http://YourLANproxyHere.com:8080"

FETCHCOMMAND="/usr/bin/wget -t 5  \${URI} -P \${DISTDIR}"

RESUMECOMMAND="/usr/bin/wget -t 5  \${URI} -P \${DISTDIR}" 
```

and it works perfectly, without using the proxy for ftp transfers.

Remember to only have http mirrors in your make.conf:

```

GENTOO_MIRRORS="http://xyz.com/gentoo http://xyz2.com/gentoo"

```

And:

I think you should use an extra directory on your proxy for the cached files, not the distfiles dir, because portage and the replicator both save partial files there, which confuses them both.

After emerging, at least on my system, the proxy also didn't cache anything because "distfiles" didn't have the right permissions.

So I changed the dir in /etc/http-replicator.conf.

----------

## flybynite

Ja, 

The changes you made to make.conf are simpler, so I've changed the howto.  Thanks!  Actually you don't even have to uncomment the FETCHCOMMAND, as that is the default anyway; only the RESUMECOMMAND is non-standard.  Neither the old nor the new way should cause ftp requests to go through http-replicator, though.  Ftp works just fine on my system following the howto.

I also added changes to the group/permissions to make it clear that http-replicator can write files in that directory.  I don't know if some installs are different or what, but this ensures everything works just fine.

Http-Replicator uses a special tmp file so there is no conflict between portage and http-replicator using the same directory.  There is nothing wrong with separate directories though,  using the same directory just makes things simple.  It also means that if you have to use ftp to get some special file on the server, replicator can still cache that file to other machines using http!

----------

## ja

After *heavily* interrupting wget (using ctrl+c) multiple times during transfers of the same files on the client and server, my client started creating .1-xx files and emerge failed. I used --fetchonly for testing.

It seems the Replicator started serving incomplete tmp files as complete files, wget recognized them as new files and starts renaming...

I haven't seen this with separate directories.

I used two machines: one running the replicator and a second as a client.

For the proxy thing:

man make.conf says:

"Either define  PROXY  or  PROXY_FTP and PROXY_HTTP."

I think with your old howto you configured the Replicator as a proxy for both protocols. I don't know why it was working for you.

----------

## flybynite

ja, some good work my friend!!

Version 1.2 is now in the works.  Change your RESUMECOMMAND everyone.

Seems I got the results of removing the -c option wrong when a rabid user kills the emerge multiple times  :Smile: 

It was emerge leaving those .1XX files because of my removing the -c option.  Http-Replicator was never serving incomplete files!!

I'm hoping Http-Replicator will still work without problems in the same directory as portage with the new RESUMECOMMAND. 

I see one more bug to fix.  Till I get that squashed, quitting the fetch might require removing the partial file download.

Keep those bugs coming!!

----------

## wizard69

@flybynite Thanks for the howto and the great app; it sounds just like what I have been looking for for my home net. Is it worth waiting for the new version, or is it safe to emerge now?

----------

## flybynite

Replicator works very well in normal use.  The issue is a possible annoyance, not a safety or security issue at all.

The thing is, I would like replicator to be able to use /usr/portage/distfiles as a cache.  On the server, though, replicator and portage must play well together using the same directory.  The issue is that when you interrupt a fetch on the server, Portage leaves an incomplete file and replicator doesn't know it is incomplete.

If for some reason you interrupt a portage fetch on the server, you might have to manually delete the incomplete file, that's all.

I use replicator every day and haven't had any problems in normal use because I don't interrupt portage downloads.

If you are paranoid, change the cache directory in /etc/http-replicator.conf to another directory.  Then just move all the files from distfiles to the new directory to prime the cache.  This will avoid the problem till I test some fixes.

----------

## flybynite

Please help test!!

While http-replicator itself hasn't changed, my install instructions and the repcacheman script are new.

This should fix the minor issues raised earlier.

----------

## wizard69

@flybynite Okay, I will give it a try and post the results.

Okay, I emerged http-replicator and followed your setup instructions for my Gentoo server and one Gentoo client. But I am not quite sure if http-replicator is being used. How can I check on this? It doesn't seem to leave any entries in the log file.

----------

## hammerhai

Executing repcacheman doesn't work for me:

```
Checking authenticity and integrity of new files...

Searching for ebuilds's ....

Done!

Found 13954 ebuilds.

Extracting the checksums....

Missing digest: dev-libs/dvxml-0.1.4

Missing digest: dev-util/jconfig-2.5

Missing digest: media-plugins/tap-plugins-0.1

Missing digest: net-analyzer/authforce-0.9.6

Traceback (most recent call last):

  File "/usr/bin/repcacheman", line 157, in ?

    digestpath = os.path.dirname(digestpath)+"/files/digest-"+pv

  File "/usr/lib/python2.3/posixpath.py", line 119, in dirname

    return split(p)[0]

  File "/usr/lib/python2.3/posixpath.py", line 77, in split

    i = p.rfind('/') + 1

AttributeError: 'NoneType' object has no attribute 'rfind'

```

----------

## wizard69

Have you tried re-emerging python?

----------

## hammerhai

Yes, but there is still the same error.

----------

## flybynite

 *wizard69 wrote:*   

> @flybynite Okay, I will give it a try and post the results.
> 
> (snip)
> 
>  How can i check on this because it  doesn't seem to leave any entries in the log file.

 

The log file is written lazily, so it might take a while to flush to disk.  You can tell it's working by looking at the emerge output.  You'll see port 8080 in the emerge - something like this:

```

Connecting to gate1.homenet.com[127.0.0.1]:8080... connected.

```

Of course, you'll also see a download speed of ~11MB/s on cached files!

----------

## flybynite

 *hammerhai wrote:*   

> Executing repcacheman doesn't work for me:
> 
> ```
> 
> (snip)
> ...

 

I should have made it obvious that repcacheman is not strictly needed.  Http-Replicator will work fine without it.  Right now, any new emerges you do will be cached!!

I wrote repcacheman to help new users create the cache directory and move existing files from /usr/portage/distfiles to the cache directory.  It also checks those files for corruption, just in case.  Portage leaves incomplete downloads in /usr/portage/distfiles; repcacheman won't copy those files to the replicator cache.

Now back to your error.  This is caused by a gentoo developer forgetting to include a digest file for a new or changed ebuild.  My guess is this will be fixed by now.  Just sync and try again.  If that doesn't work, just copy all the packages (/usr/portage/distfiles) to the cache dir yourself (/var/cache/http-replicator), and tell me your portage version.  I'll add code to catch this error as soon as I get the chance....

----------

## flybynite

Updated HOWTO to ver 1.4

----------

## senectus

 :Shocked:   :Confused:  ye gods..

I have 4-5 gentoo boxes on my home lan, with a view to expand, so I thought I'd be a responsible gentooist and make my own rsync mirror. So I started here:

https://forums.gentoo.org/viewtopic.php?t=59134&postdays=0&postorder=asc&highlight=rsync+mirror&start=0

which was great until I saw this post in the thread (halfway through making the mirror of course)

https://forums.gentoo.org/viewtopic.php?t=110973&postdays=0&postorder=asc&start=0

Then I thought.. ooooh this is much better.. so I started doing that.. until I noticed THIS THREAD!

Now I think I'm going to use the first link to do my rsync updates and this thread's solution to actually get the files...   :Shocked:   :Laughing:  (that "should" work, shouldn't it?)

anyway.. thanks for the fantastic voyage of discovery...   :Laughing:  but I think in future I'll surf this forum with my eyes closed..   :Wink: 

----------

## dhurt

Thanks for the excellent program.  I went through the exact same chain of programs as senectus  :Wink:   Works like a charm  :Very Happy: 

----------

## flybynite

OK, you're killing me knowing there is outdated info out there.  The rsync mirror described by the HOWTO: Central Gentoo Mirror for Internal Network at https://forums.gentoo.org/viewtopic.php?t=59134 is currently outdated (as of June 1, 2004) and uses insecure options.  Although I believe there will be a fix in a later version of rsync, until then, following that old howto is insecure!!

I created an updated version - HOWTO:Local Rsync Mirror at https://forums.gentoo.org/viewtopic.php?t=180336

----------

## dhurt

Now that will be one more post to go through in the chain of different options  :Wink: 

----------

## dhurt

In the above setup, is there any problem with using the cache directory as the source directory for server to get the files it needs?

So for instance setting in /etc/make.conf

```

PKGDIR=/var/cache/http-replicator

```

I was wondering if this would cause write issues, as the proxy downloads the file to the cache directory while serving it to the localhost, which is in turn saving it to the same place.  I guess this could be overcome by turning off the http_proxy for the localhost?  Is there a problem with this configuration, something that I am not seeing?

Thanks for the program   :Very Happy: 

----------

## flybynite

 *ender2431 wrote:*   

> In the above setup, is there any problem with using the cache directory as the source directory for server to get the files it needs?
> 
> So for instance setting in /etc/make.conf
> 
> ```
> ...

 

Well, PKGDIR is for the .tbz2 binary packages portage creates; I think you meant DISTDIR by your description, which is where portage stores its downloaded tarballs?

Using DISTDIR (/usr/portage/distfiles) for replicator's cache is not recommended, for now.

That is the reason I created the 'repcacheman' script.  It manages portage's DISTDIR and replicator's cache directory very intelligently.  With repcacheman, there really aren't any drawbacks to following my howto exactly and keeping DISTDIR separate from replicator's cache.

I originally intended replicator to share the directory with portage - but portage needs to learn how to share first  :Smile: 

----------

## mjr104

Hi,

I've just started to play with various portage caching ideas and have setup the http-replicator and the local rsync mirror.  Great!  

I'm slightly lost with all the options and suggestions so I have a quick question about binary packages if that's ok.

What's the best/simplest way to also share the binary packages that are in the packages directory of the machine hosting the replicator?  This machine has already built a large number of packages and I ultimately want this machine to build and serve all the packages for my network as they get updated.

I assume that the replicator only serves the files that (nominally) live in distfiles?  The option in make.conf for binary packages seems to suggest I need http or ftp access to a server - serving the packages directory?

Cheers

Mike

----------

## senectus

ok.. here is a carry on question to the one above..

Let's say that I want to build packages on lots of machines of different specs, then host the binaries on one PC so that I can have a whole host of binaries on the network for network installations (the idea is to make Gentoo a feasible distro to install at installfests)..

----------

## dhurt

Use FTP and then have a different directory for each architecture.  Then specify the location in /etc/make.conf with the binhost option:

 *Quote:*   

> 
> 
> # Portage uses PORTAGE_BINHOST to specify mirrors for prebuilt-binary packages.
> 
> # The list is a single entry specifying the full address of the directory
> ...

 

----------

## flybynite

I've submitted the updated ebuild, now it is up to the gentoo dev's to see if this makes it into the portage tree!!

I guess a couple of good words wouldn't hurt  :Smile: 

https://bugs.gentoo.org/show_bug.cgi?id=50872

----------

## flybynite

 *mjr104 wrote:*   

> 
> 
> What's the best/simplest way to also share the binary packages that are in the packages directory of the machine hosting the replicator?

 

http-replicator will serve any file that is in its cache directory by http, but I never thought of using it with this portage feature.

I will look into this feature, it looks possible....

----------

## Ymerej

I installed and set up http-replicator, I did an emerge of a package on a client box, and observed that it went through the proxy -- great.

I got here basically by the same path as some other posters:  I didn't want to query the rsync servers too much, with "emerge rsync" on each of the three Gentoo machines on my LAN.  

I am going to ask a really naive question here: does http-replicator also cache the "emerge rsync", so that I can do it once on the http-replicator server and then on all my other gentoo boxes that use the http-replicator proxy, without risking overloading the rsync servers and getting banned?

Or do I still need to set up an rsync server on one machine on my LAN?

Thanks.

----------

## dhurt

You still need to set up the rsync mirror.  It is not cached by the http proxy.  Here is a link to the updated how-to:

https://forums.gentoo.org/viewtopic.php?t=180336&postdays=0&postorder=asc&start=0

----------

## soulwarrior

 *Ymerej wrote:*   

> 
> 
> Or do I still need to set up an rsync server on one machine on my LAN?
> 
> Thanks.

 

Yes, you do have to install an rsync server on your network to provide this functionality, but like most things on gentoo, it is quite easy to do   :Wink: 

```

emerge app-admin/gentoo-rsync-mirror

```

Make some changes to the /etc/rsync/rsyncd.conf configuration file:

```

uid = nobody

gid = nobody

use chroot = yes

max connections = 20

pid file = /var/run/rsyncd.pid

motd file = /etc/rsync/rsyncd.motd

transfer logging = no

log format = %t %a %m %f %b

syslog facility = local3

timeout = 300

[gentoo-x86-portage]

#this entry is for compatibility

#path = /opt/gentoo-rsync/portage

path=/usr/portage

comment = Gentoo Linux Portage tree

[gentoo-portage]

#modern versions of portage use this entry

#path = /opt/gentoo-rsync/portage

path=/usr/portage

comment = Gentoo Linux Portage tree mirror

exclude = distfiles

```

I have just changed the two path entries (path = /...) to point to my existing portage tree.

Now on every client, add this line to make.conf to point them to the rsync server on your network:

```

SYNC="rsync://192.168.0.1/gentoo-portage"

```

Just replace the IP address with your server's and you are done   :Wink: 

Ah, of course you have to start it:

```

rc-update add rsyncd default

/etc/init.d/rsyncd start

```

----------

## flybynite

I recommend all users with a LAN setup http-replicator using this HOWTO and also setup a local rsync server using the HOWTO at  https://forums.gentoo.org/viewtopic.php?t=180336

This will best serve your lan and gentoo!!

----------

## flybynite

I made a small change to the title of this thread.  I hope 'download' will make more people stop and look at what http-replicator can do for them!! 

Work has started on the next version of http-replicator!!

----------

## dhurt

Thanks for the hard work  :Very Happy: 

----------

## Gherald2

Is there a way to configure the clients to fetch stuff themselves when the server is down?

I ask because I'd tell other people on our dorm network about my server, but I can only guarantee 90-95% uptime and I don't want their fetches to fail on my account...

----------

## deezee

This is a really good concept, but it seems to me that it would be a lot easier to mount /usr/portage/distfiles (and maybe even /usr/portage, to solve the whole rsync thing as well) through NFS or Samba. Of course this carries with it other problems, although none of them are unsolvable.

Anywho, that's how I do it... :)

----------

## dhurt

One of the problems with NFS and Samba that this solves is what happens when two clients start downloading the same file at the same time.

----------

## ejona

First of all, I would like to mention that I started reading on the board from the tsp program.  I was happy with what that one offered (and was okay with the bugs), but this has no downfalls whatsoever that I can see.

I can say that using the how-to worked perfectly.  I am on an AMD64 processor, and since it works fine, I would say the only thing it needs is amd64 added to the KEYWORDS="x86" in the ebuild   :Very Happy: .

Thanks for the work (I am especially thankful, for I am on a modem connection and have 4 gentoo boxes; needless to say, I am also using a local rsync mirror) and I hope the developers include the package in portage soon.  I will also keep an eye out for ways to break it.

----------

## flybynite

Thanks for the report ejona, I've updated the ebuild to include amd64 in the keywords.

Can anyone confirm any other arch's?

----------

## larand54

I just tried to use this "http-mirror" but failed. 

The server looks to be running, but the client never tried to connect to it. 

The only changes I made to make.conf for the client were: 

```

 http_proxy="http://Jupiter:8080" 

 RESUMECOMMAND="/usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}" 

```

I haven't changed the mirror line: 

```

GENTOO_MIRRORS="http://ftp.rhnet.is/pub/gentoo/ http://212.219.247.19/sites/www.ibiblio.org/ge$ 

```

What more do I have to do on the clients? 

 I also use my server as a sync-server and it works just fine.

----------

## larand54

I finally made it!  :Very Happy: 

 *larand54 wrote:*   

> 
> 
>  The server looks to be running but the client never tried to connect to it. 
> 
> 

 

It didn't run - a mistake of mine  :Embarassed: 

But when I got it running, no data was transferred.

The problem was that I hadn't set up the IP list in /etc/http-replicator.conf.

When I did that and restarted, the replicator worked.

----------

## Monkeywrench

This is awesome! Both this and your rsyncd howto. Thank you flybynite. You should contact the people who do the gentoo weekly newsletter about both of these howtos. It would save the generous mirror providers some bandwidth =)

----------

## flybynite

Thanks monkeywrench!!

I submitted the howto's for inclusion in GWN!!

The newsletter is probably the best way to get the word out!!

----------

## flybynite

 *larand54 wrote:*   

> The problem was that I hadn't set up the IP-list in /etc/http-replicator.conf
> 
> 

 

I tried to make the config generic so it would work with all of the most common private LAN IP ranges.  Did I miss a range, or are you using a public IP range on your LAN?

----------

## dmitrio

I have copied this HOWTO, with suggestion of flybynite, to gentoo-wiki.com 

http://gentoo-wiki.com/HOWTO_Download_Cache_for_LAN-Http-Replicator

If you see anything that should be added or changed, feel free to do so. 

Thank you for a great HOWTO.

----------

## meowsqueak

OK, so http-replicator is set up as a proxy on my LAN for all my Gentoo boxes. But how do I set up http-replicator itself to use an external proxy? I can't see anything in /etc/http-replicator.conf that might help.

----------

## flybynite

Sorry for the delay, my net access has been a problem for almost a week.

You're right, support isn't there for an upstream proxy.

It has been added to the next version already in testing....

----------

## xkb

I get the following error on the machine running the proxy:

```

Resolving myhostname.net... someip

Connecting to myhostname.net[someip]:8080... connected.

Proxy request sent, awaiting response...

10:16:22 ERROR -1: No data received.

```

Any ideas what's causing this? I followed the howto.

The proxy works fine for the client machines.

```

>>> Downloading http://distro.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/portage-2.0.50-r9.tar.bz2

--10:20:56--  http://distro.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/portage-2.0.50-r9.tar.bz2

           => `/usr/portage/distfiles/portage-2.0.50-r9.tar.bz2'

Resolving mydomain.net... done.

Connecting to mydomain.net[someip]:8080... connected.

Proxy request sent, awaiting response... 404 Not Found

10:20:57 ERROR 404: Not Found.

 

>>> Downloading http://gentoo.twobit.net/portage/portage-2.0.50-r9.tar.bz2

--10:20:57--  http://gentoo.twobit.net/portage/portage-2.0.50-r9.tar.bz2

           => `/usr/portage/distfiles/portage-2.0.50-r9.tar.bz2'

Resolving mydomain.net... done.

Connecting to mydomain.net[someip]:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: 222,044 [application/x-tar]

```

After this it appears in /var/cache/http-replicator

----------

## Gherald2

Hmm, have you tried localhost:8080 instead?

----------

## meowsqueak

 *flybynite wrote:*   

> You're right, support isn't there for a proxy.
> 
> It has been added to the next version already in testing....

 

Oh this is good news! Great - I'll definitely be one of the first to test this new feature  :Smile: 

----------

## La`res

I have a problem.

I emerge from 'jayden' (client) and I get the following from the telnet 'monitor'

```
HttpClient 109 Received header from 176.24.10.50:35782

  GET http://joey:8080/distfiles/openmotif-2.1.30-4_MLI.src.tar.gz HTTP/1.0

  User-Agent: Wget/1.9.1

  Host: joey:8080

  Accept: */*

HttpClient 109 Connecting to joey

HttpClient 110 Received header from 176.24.10.2:33215

  GET /distfiles/openmotif-2.1.30-4_MLI.src.tar.gz HTTP/1.0

  host: joey

  connection: close

  accept: */*

  user-agent: Wget/1.9.1

HttpClient 110 Error: 'NoneType' object has no attribute 'groups'

  TRACEBACK

  read (file /usr/lib/python2.3/asyncore.py, line 69)

  handle_read_event (file /usr/lib/python2.3/asyncore.py, line 390)

  handle_read (file /usr/bin/HttpReplicator.py, line 216)

  handle_header (file /usr/bin/HttpReplicator.py, line 278)

HttpServer 110 Closed
```

Then I need to shut down the emerge from jayden manually (there seems to be no timeout).

If I emerge from 'joey' (server) I don't have this problem.

It seems to me that the server is calling itself and then barfing.

Any Ideas??

----------

## xkb

 *Gherald wrote:*   

> Hmm, have you tried localhost:8080 instead?

 

That did the trick! Thanks!  :Very Happy: 

----------

## jpc82

Problem fixed itself

----------

## dripton

Thanks for http-replicator.  I like it even better than tsp-cache.

For me, the main missing feature is the ability to fetch all distfiles from the local mirror, even those that aren't hosted on Gentoo http mirrors.  Manually stuffing files into one cache directory is a lot nicer than spreading them around to many distfiles directories.

I used the tip in bug 50872 to handle ftp and nomirror ebuilds, but there isn't an equivalent workaround for RESTRICT="fetch".  (Which is bug 37455.)  I hacked my local portage.py to work around this, but that will quickly get annoying to maintain.

----------

## meowsqueak

 *flybynite wrote:*   

> You're right, support isn't there for a proxy.
> 
> It has been added to the next version already in testing....

 

Any chance I could assist with the testing by installing it at my site?

----------

## JohnHerdy

 *flybynite wrote:*   

> Sorry for the delay, my net access has been a problem for almost a week.
> 
> You're right, support isn't there for a proxy.
> 
> It has been added to the next version already in testing....

 

Hi flybynite, do you have something we can test/use? I think this is great software!!! however without proxy support I'm not able to use this.   :Crying or Very sad: 

----------

## meowsqueak

 *JohnHerdy wrote:*   

>  *flybynite wrote:*   Sorry for the delay, my net access has been a problem for almost a week.
> 
> You're right, support isn't there for a proxy.
> 
> It has been added to the next version already in testing.... 
> ...

 

me too    :Cool: 

----------

## cdunham

Nice HOWTO!

A couple of issues:

- The ebuild requires 'inherit eutils' for epatch to work

- repcacheman works great, but you may want to note that if you choose not to use it, you need to manually create the cache directory and set its permissions.

Thanks! This is what makes Gentoo such a great distribution!

----------

## flybynite

I guess something changed so that 'inherit eutils' is now required.  I'll add that to the ebuild, thanks.

I promise to find time to get an ebuild out in the next couple of days for the version with proxy support.

----------

## flybynite

 *cdunham wrote:*   

> Nice HOWTO!
> 
> - repcacheman works great, but you may want to note that if you choose not to use it, you need to manually create the cache directory and set its permissions.
> 
> 

 

The ebuild prints this after merging:

 *Quote:*   

> 
> 
> *
> 
>  * If you wish to change defaults
> ...

 

Did you miss this or is this not clear?

----------

## flybynite

 *dripton wrote:*   

> Thanks for http-replicator.  I like it even better than tsp-cache.
> 
> For me, the main missing feature is the ability to fetch all distfiles from the local mirror, even those that aren't hosted on Gentoo http mirrors. 
> 
> I used the tip in bug 50872 to handle ftp and nomirror ebuilds, but there isn't an equivalent workaround for RESTRICT="fetch".  (Which is bug 37455.) 

 

The tip in bug 50872 was added to the HOWTO in ver 1.4, so others don't need to worry about it.  However, if you installed http-replicator prior to HOWTO ver 1.4, you may want to read the /etc/portage/mirrors section.

I may even add the file to the ebuild.  Most users don't have the file; smart users can merge it themselves.

The RESTRICT="fetch" problem is strictly a portage issue.  This is not the only portage quirk that doesn't work well with others.  http-replicator could share /usr/portage/distfiles if portage chown'd files to the portage user  :Smile: 

----------

## cdunham

 *flybynite wrote:*   

> 
> 
> The ebuild prints this after merging:
> 
>  *Quote:*   
> ...

 

No, it's clear about using it, but in the HOWTO you say:

 *Quote:*   

> repcacheman is nice to help install and manage http-replicator on gentoo but isn't needed for replicator to work.

 

which is true, as long as the cache dir is set up properly, that's all.

----------

## flybynite

Ok, I'll check the wording in that line.

I added that because another user had trouble with repcacheman and didn't know replicator was still working for him.  It is confusing.

----------

## flybynite

Updated HOWTO - version 1.5

----------

## ShaunC

 *flybynite wrote:*   

> 
> 
> Can anyone confirm any other arch's?

 

Yep. http-replicator/repcacheman work just fine on alpha.

----------

## flybynite

Thanks, I'll add that to the ebuild.

----------

## meowsqueak

 *flybynite wrote:*   

> I promise to find time to get an ebuild out in the next couple of days for the version with proxy support.

 

How's that promise coming along?  :Wink: 

----------

## flybynite

I have been working on it.  The new version that includes proxy support is so different that the entire ebuild/package/patches will have to be redone  :Sad: 

This is a major amount of work when most users will see no improvements.

I know you need proxy support so I'm considering backporting the proxy support.  I'm testing to see which is the better option.

----------

## meowsqueak

Thanks for sticking with it. I think as this becomes a more popular solution, there will be increasing desire for proxy support. However at the moment, I appreciate the time and effort you are putting into this and if there's anything I can do to assist (especially testing) then please let me know. Private message me if necessary.

----------

## woZa

Great work flybynite. I had this up and running in 5 mins. Now reclaimed 3.5gigs from /usr... just need to upgrade my servers ide controller (ata 33 for ata 133 drive!) to get the real speeds! Another thing to get round to sometime!

----------

## JohnHerdy

 *flybynite wrote:*   

> I have been working on it.  The new version that includes proxy support is so different that the entire ebuild/package/patches will have to be redone 
> 
> This is a major amount of work when most users will see no improvements.
> 
> I know you need proxy support so I'm considering backporting the proxy support.  I'm testing to see which is the better option.

 

hi flybynite,

Thanks a lot for all your effort you put in http-replicator.

I agree with you that most users will see no improvements; however, the users that work in an enterprise environment have the greatest benefit from using this program. I mean, it's nice that John Doe with 2 computers at home is able to use this program, but real bandwidth savings and load reduction on the mirrors can be achieved in the enterprise. Unfortunately, in an enterprise environment you always have to go through a proxy. 

Why don't you post what you have now and we can help you backport it or create an ebuild for the new version.

I hope I don't sound pushy but I really like your program.

----------

## scoy

What benefits does this have over NFS exports?

I use one machine to export its /usr/portage/distfiles, then the rest (2) of my machines use that.  I export it with no_root_squash, so all machines can store the files if they download them.

----------

## JohnHerdy

 *scoy wrote:*   

> What benefits does this have over NFS exports?
> 
> 

 

If a file is on an NFS volume and two or more hosts need that file at the same time, all hosts will try to download it, which will result in an unsuccessful download. A powerful feature of http-replicator is that it's able to stream partial downloads.

----------

## woZa

When I emerge an app from a client that isn't on the server the file is downloaded to both the server http-replicator cache dir and the client distfiles dir... Is this normal behaviour?

There is no need for the files to remain in the client distfiles dir, is there? Should I just use cron to empty the client distfiles dir daily, for example?

Thanks in advance

----------

## flybynite

 *woZa wrote:*   

> When I emerge an app from a client that isn't on the server the file is downloaded to both the server http-replicator cache dir and the client distfiles dir... Is this normal behaviour?

 

Yes, the clients require the file in the distfiles dir before portage can emerge it.

 *woZa wrote:*   

> 
> 
> There is no need for the files to remain in the client distfiles dir, is there? Should I just use cron to empty the client distfiles dir daily, for example?
> 
> Thanks in advance

 

Once emerged, you can delete all files in the client's distfiles dir.  

A cron job to do this would be fine as long as you don't delete while emerging.  It's a rare and small chance, but imagine an emerge -u world running overnight when the cron script kicks in and deletes a file that portage is still downloading or is just about to unpack (any other time would be no problem).  While not catastrophic, since the package is still in http-replicator's cache, I think it would make portage error out and stop the emerge.

Since I do many upgrades overnight, that small window between the start of the package download and the beginning of the unpack is why I don't have a cron script delete the files on my computer.  I just do it manually every once in a while, which works fine for me.  It's OK if you want to use cron; the most that could happen is that, on a rare occasion, that long list of upgrades won't be finished in the morning...

The repcacheman script automatically manages the files in the distfiles dir on the server box.  Just run repcacheman either occasionally, or at most after every upgrade on the server.
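If you do go the cron route, a sketch along these lines keeps that window small by only deleting files that haven't been touched for a while — illustrative Python, not part of http-replicator; the default distfiles path and the one-day threshold are assumptions, with the age check there to avoid racing an emerge that is still downloading:

```python
# Cron-friendly cleanup sketch for a client's distfiles directory.
# The path and one-day threshold are assumptions, not portage defaults.
import os
import time

def clean_distfiles(distdir="/usr/portage/distfiles", min_age_days=1.0):
    """Delete regular files older than min_age_days; return names removed."""
    cutoff = time.time() - min_age_days * 86400
    removed = []
    for name in sorted(os.listdir(distdir)):
        path = os.path.join(distdir, name)
        # Leave directories and recently touched files alone.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```

A nightly crontab entry pointing at a script like this (path hypothetical) would then only ever remove files from upgrades that finished at least a day earlier.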

----------

## woZa

thanks flybynite...

----------

## flybynite

New version ready for testing!!  Although the init code, logging, and options are cleaned up, this version primarily adds external proxy support.

Current users not needing external proxy support should wait for the next stable release.  Experienced users can help test.

If you don't have any version of http-replicator installed and wish to help test, follow the HOWTO first, then upgrade to this latest version.

1. Download and install the ebuild in the portage overlay directory (/usr/local/portage by default), then emerge http-replicator.

Download: 

 http-replicator-flybynite-1.5.tar.bz2 

which contains both the old and new ebuilds.

```

# cd /usr/local/portage

# tar -xvjf /PathToFile/http-replicator-flybynite-1.5.tar.bz2

```

Then emerge:

```

# emerge -u http-replicator

```

This will upgrade http-replicator.

Edit /etc/conf.d/http-replicator to add your external proxy and check other defaults.

**Note:** repcacheman has changed.  If you changed the default cache dir or the default user, you must call repcacheman with those options.  

```

repcacheman --user USER --dir /path/to/cache

```

2.  You can downgrade http-replicator if you have any problems with the new version

```

# emerge  =net-misc/http-replicator-2.0-r2

```

Please report your experience!!

[Edit] update to rc3

Last edited by flybynite on Fri Aug 06, 2004 7:32 pm; edited 2 times in total

----------

## Boworr

Hi, 

Firstly, this is great and saves me a huge amount of time/bandwidth at home.

Is it possible to manually add items to the cache? For example, when I'm at work I'd like to use the company's bandwidth  :Smile:  to fetch all my packages, then when I get home copy them over to the cache server so the other machines can get them quickly.

Is this possible? If so, what do I copy (where does emerge store temp downloads anyway?) and where do I copy it to?

Thanks

----------

## flybynite

Yes, you can manually add files to the cache.  The included script repcacheman will do it for you also.

The safest method would be to copy the files to /usr/portage/distfiles and then run repcacheman.  repcacheman is a script designed to import files that were downloaded from ftp or added manually.  The script will verify the md5sum of all files before adding them to replicator's cache.

The fastest method would be to just copy the files directly to replicator's cache, which is /var/cache/http-replicator unless you changed the defaults.  The only drawback to this method is that the md5sums aren't checked.  If any files you downloaded had errors in them, you will have to manually delete them from the cache.  

The latest version of replicator will md5sum replicator's cache; it's just still in testing.

----------

## meowsqueak

 *flybynite wrote:*   

> New version ready for testing!!

 

Excellent! I will definitely test this out and let you know how it goes. Thanks!

----------

## Boworr

 *flybynite wrote:*   

> Yes, you can manually add files to the cache.  The included script repcacheman will do it for your also.

 

So where do I copy them from?  :Razz:  i.e., where does emerge store files when I --fetch them?

Boworr

----------

## flybynite

Emerge stores all the files in /usr/portage/distfiles

Copy them from there to the /usr/portage/distfiles on the box with replicator and then run repcacheman

----------

## carpman

Hello.  OK, trying to save some disk space on my notebook and decrease bandwidth for emerge sync etc., I set up an rsync mirror and http-replicator... very nice, but I have some questions, as I now plan to build a small local server to be the main rsync/http-replicator host.

1. Getting rid of /usr/portage on clients: currently I have NFS-exported /usr/portage from server to client... is this a good idea? I would love to get rid of the /usr/portage dir on local machines and just have the http-replicator cache on the server.

2. Disk space and dir partitions for the server: with the server I plan to have a separate partition for portage files. From what I can gather, /usr/portage is not that important or large; with http-replicator, and repcacheman on cron moving the files to the http-replicator cache dir, it is this cache dir that I can give its own partition.

Am I correct?

cheers

----------

## carpman

Umm, it seems that NFS-exporting /usr/portage to the client does not work  :Sad: 

The rsync mirror fails, and emerge -uD world just seems to go on forever finding dependencies.

----------

## meowsqueak

/usr/portage should be stored somewhere FAST and accessing it over NFS is relatively slow. Every time Portage builds a dependency tree, it has to read lots of files from here.

----------

## JohnHerdy

 *flybynite wrote:*   

> New version ready for testing!!  Although the init code,logging, and options are cleaned up, this version primarily adds external proxy support.
> 
> 

 

yahoo!!!, I will test it and report back, thanks a lot!!!

----------

## JohnHerdy

I'm testing the proxy support. Can you tell me where to add my username and password?

----------

## flybynite

New http-replicator-2.1_rc3  Includes support for basic proxy authentication (as requested by JohnHerdy)

Everybody testing please update to the latest version and test.  You will need to download http-replicator-flybynite-1.5.tar.bz2  Instructions are in this post:

https://forums.gentoo.org/viewtopic.php?t=173226&start=76

----------

## JohnHerdy

 *flybynite wrote:*   

> New http-replicator-2.1_rc3  Includes support for basic proxy authentication (as requested by JohnHerdy)

 

Man you have no idea how happy you have made me. It's working like a charm!!!

Minor improvement;

- if the cache dir doesn't exist when the daemon starts, create it and set the correct permissions, or else the daemon crashes.

- in the default /etc/conf.d/http-replicator, add a " at the end of the --external examples, because without the " at the end http-replicator crashes.

----------

## flybynite

You're welcome!!

 *JohnHerdy wrote:*   

> 
> 
> Minor improvement;
> 
> - if the cachedir doesn't exist when the daemon starts create it and set the correct permissions or else the daemon crashes.
> ...

 

Actually, replicator will print 'invalid directory' or 'no read/write permission for directory' and then exit  :Smile: 

You might have missed it because the instructions are only in the complete HOWTO at the start of this thread, but...

I wrote the repcacheman utility to install http-replicator.  It will install and maintain replicator's cache.  It creates the cache dir if it doesn't exist and then transfers files from the host to the cache.

If your server box is gentoo, you should still run repcacheman to prime the cache with your existing files.

In this release candidate version, if you changed the cache dir from the default /var/cache/http-replicator, or changed the user from portage, you need to set the cache dir and/or the user on the command line like so

```

# /usr/bin/repcacheman  --dir /var/cache/http-replicator  --user portage

```

 *JohnHerdy wrote:*   

> 
> 
> - in the default /etc/conf.d/http-replicator, add a " at the end of the --external examples, because without the " at the end http-replicator crashes.

 

Thanks, my typo...  Fixed...

----------

## carpman

I have been building a new local server (it will be the http-replicator server), and so I thought that using the rsync mirror and http-replicator which are set up on my workstation would be a good idea.

When I got to the chroot stage of the install, I downloaded a copy of my notebook's make.conf, as it has the same cpu and already had rsync and replicator settings which work fine. I amended the USE setting and did emerge sync; all went well.

Then I did bootstrap; all went well. Then I did emerge system, and all seemed to be going well, with packages being grabbed from the http-replicator cache.

I went to bed.

In the morning I found that it had stalled on python: it appeared to have downloaded python 3 times at 7mb, but each time emerge failed saying python could not be downloaded.

I disabled http-replicator in make.conf and tried again; this time it resumed the python download from 3mb and finished it OK with a total of 7mb. This I thought strange, so I checked the cache dir and found that in there python indeed was only 3mb. 

So http-replicator found python in the cache dir but did not recognise that it was an incomplete download, so it kept serving it as the full file and emerge kept failing.

any way of preventing this?

I am using ver 2.0-r2

cheers

----------

## flybynite

It appears you may be using an old replicator config or have an error in the setup.  I still want to check it out though.

You mentioned in your other posts about sharing  /usr/portage over nfs.  You weren't doing something like that were you?

Could you PM me your make.conf, replicator config, and http-replicator logs? Plus any other info, like either your emerge.log or a copy of the last output from the emerge?

Last edited by flybynite on Wed Aug 11, 2004 1:11 am; edited 1 time in total

----------

## meowsqueak

Oh, is /etc/http-replicator.conf deprecated now? I only just noticed the configuration file is now /etc/conf.d/http-replicator.conf. Does this mean the telnet monitor port is now unsupported?

Last edited by meowsqueak on Sun Aug 08, 2004 10:55 pm; edited 1 time in total

----------

## meowsqueak

Looking good! Latest version with proxy support seems to be working well here  :Smile: 

How difficult would it be to add bandwidth-limiting on the http-replicator - external proxy transfer? Does http-replicator use a new wget process to request the file from the external proxy, or does it send the HTTP commands directly itself?

----------

## flybynite

 *meowsqueak wrote:*   

> Oh, is /etc/http-replicator.conf deprecated now? I only just noticed the configuration file is now /etc/conf.d/http-replicator.conf. Does this mean the telnet monitor port is now unsupported?

 

Yes, the testing version of http-replicator uses /etc/conf.d/http-replicator for the config.

The telnet monitor was removed now that the logging is much improved.  You can use 'tail -f /var/log/http-replicator.log' from a local terminal or ssh for the same type of monitoring.

----------

## flybynite

 *meowsqueak wrote:*   

> Looking good! Latest version with proxy support seems to be working well here 
> 
> How difficult would it be to add bandwidth-limiting on the http-replicator - external proxy transfer? Does http-replicator use a new wget process to request the file from the external proxy, or does it send the HTTP commands directly itself?

 

New features are planned  :Smile: 

Bandwidth limiting wasn't one being considered  :Sad:  Nobody ever asked for it before...  Http-Replicator is all Python and doesn't use any external programs such as wget.

----------

## carpman

 *flybynite wrote:*   

> I apprears you may be using an old replicator config or have an error in the setup.  I still want to check it out though.
> 
> You mentioned in your other posts about sharing  /usr/portage over nfs.  You weren't doing something like that were you?
> 
> Could you PM me your make.conf, replicator config  and http-replicator logs?  Plus any other info like either your emerge.log or a copy of the last output from the emerge?

 

Hello, I had tried sharing portage over NFS, but it did not work well, and it was not being done at the time I had the problem, though the file concerned may have been affected when I did.

Do you want the conf/log files from the server or client or both?

----------

## Gherald2

carpman, if for whatever reason you do decide to go with an NFS share instead of http-replicator you probably only want to share /usr/portage/distfiles, not the entire /usr/portage directory.

Now, about your problem: my guess is you haven't set RESUMECOMMAND in your make.conf as outlined in the first post of this thread.  But even with the correct one, sometimes downloads fail for strange reasons and you have to manually flush the cache, that is, on the server:

rm /var/cache/http-replicator/<filename>

... where <filename> is the file that failed to download.

----------

## JohnHerdy

 *flybynite wrote:*   

> New features are planned 

 

Can you tell us more?

I have one feature request; I would like to see in the logfile the IP-address of the proxyclient that is requesting a file. Then it's possible to setup some auditing and prevent abuse. Thanks a lot!

----------

## carpman

 *Gherald wrote:*   

> carpman, if for whatever reason you do decide to go with an NFS share instead of http-replicator you probably only want to share /usr/portage/distfiles, not the entire /usr/portage directory.
> 
> Now, about your problem: my guess is you haven't set RESUMECOMMAND in your make.conf as outlined in the first post of this thread.  But even with the correct one, sometimes downloads fail for strange reasons and you have to manually flush the cache, that is, on the server:
> 
> rm /var/cache/http-replicator/<filename>
> ...

 

I have stopped using nfs share of /usr/portage

The problem only occurred on a new build of gentoo, not on updates to other machines on the network. I did remove the offending file from the cache.

This is the relevant part of make.conf from the client:

```

PORTAGE_TMPDIR=/var/tmp

PORTDIR_OVERLAY=/usr/local/portage

# GENTOO_MIRRORS="http://www.mirror.ac.uk/sites/www.ibiblio.org/gentoo http://gentoo.oregonstate.edu http://www.ibiblio.org/pub/Linux/distributions/gentoo"

# Default fetch command (5 tries, passive ftp for firewall compatibility) 

http_proxy="http://192.168.1.5:8080" 

#FETCHCOMMAND="/usr/bin/wget -t 5  --passive-ftp \${URI} -P \${DISTDIR}" 

RESUMECOMMAND=" /usr/bin/wget -t 5 --passive-ftp  \${URI} -O \${DISTDIR}/\${FILE}"

# SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"

SYNC="rsync://192.168.1.5/gentoo-portage"

RSYNC_RETRIES="3"

```

As you can see, I did have the resume section uncommented.

----------

## Gherald2

Like I said, strange things happen even with the correct resume command.  I have had such problems from time to time, the solution is to delete the file from the cache and start over.

The way to be safest when doing a long emerge is to fork an emerge -f process:

Open up a second console and add -f to your regular emerge command.  Once the first file is downloaded, go back to the first terminal and run your emerge w/o -f.

The downloads will complete sooner so you can make sure everything is fetched ok for whatever long build you are doing.

----------

## flybynite

 *carpman wrote:*   

> 
> 
> This is the relevant part of make.conf from the client:
> 
> ```
> ...

 

I think I can spot the primary source of the errors here.  You are using the default mirrors!!!

During the development of http-replicator I had many weird errors, and I traced them back to the default mirrors.  They are so overloaded that frequent timeouts and random network errors are common.  I had particular trouble with ibiblio, so I removed all ibiblio mirrors from my system.

Since I changed my mirrors those problems went away!!

I have two recommendations:

1. Use other mirrors by uncommenting the GENTOO_MIRRORS line in your make.conf.

2.  Remove any ibiblio mirrors.

emerge mirrorselect if you need help in finding fast mirrors

----------

## flybynite

 *Gherald wrote:*   

> Like I said, strange things happen even with the correct resume command.  I have had such problems from time to time, the solution is to delete the file from the cache and start over.
> 
> 

 

Which mirrors are you using?

----------

## carpman

flybynite, thanks for the info; I will try the mirror suggestion.

However, this still does not answer my question as to why http-replicator was serving an incomplete file from the cache and reporting it as complete.

Surely there should be a safeguard so this does not happen!

----------

## Gherald2

 *flybynite wrote:*   

>  *Gherald wrote:*   Like I said, strange things happen even with the correct resume command.  I have had such problems from time to time, the solution is to delete the file from the cache and start over.
> 
>  Which mirrors are you using?

 

tds.net, gentoo.chem.wisc.edu, and a michigan lug

all very reliable AFAICT, especially tds.net, which I use for most things.

Do you suppose it would be possible to get http-replicator to automatically delete a failed file and retry?

----------

## flybynite

 *carpman wrote:*   

> However this still does not answer my question as to why http-replicator was downloading an incomplete file from the cache and reporting it as being complete?
> 
> Surely there should be a safe guard so this does not happen!

 

There is room for improvement of course, but replicator does have many safeguards to prevent this from happening.  I believe it takes a double error of a broken response from a server AND a SIMULTANEOUS network error to make this happen!!   This is only in theory and I have never been able to reproduce this type of error.

The most difficult question is how is replicator to know the file is incomplete?  

http-replicator works like an http server, similar to apache.  Does apache check to make sure the page you're viewing is complete?  No, it can't; it just blindly retrieves html pages from a directory and sends them to your browser.  Do the public mirrors md5 the packages in their directories?  No, they can't; most aren't even gentoo.  They depend on the upstream server only sending good files and on standard network error checking.

Replicator could check every file against a mirror before serving that file to clients.  This would mean that if the net was down, replicator wouldn't serve a file that is already in its cache, so replicator couldn't be used to help install a gentoo system without internet access like it can now.  It would also mean that you couldn't serve custom packages to systems on your LAN; should replicator refuse to serve those packages because they don't exist on a public mirror?

I could have replicator use some portage features to check the md5, but then replicator wouldn't run on a non-gentoo box.  Many gentooers have to deal with a university's or employer's choice of servers.  Should replicator be changed to only work with gentoo?  Right now it can easily work on a Debian/Fedora/whatever server at a university.

Now, having said that, I do have plans to combat possible errors.  repcacheman will be expanded to check replicator's cache dir.  This way gentooers can take advantage of portage and its md5s, but replicator itself stays very portable.  Then another audit of the code in replicator can trap even more possible errors. 

Remember that I have tested everything I can think of to make replicator NOT put an incomplete file in the cache.  I challenge everyone to make replicator do so.  I've killed processes, mangled routing, pulled network cables, unplugged power from routers, rebooted during download, etc, etc, and not once has replicator ever put an incomplete file in the cache.  

repcacheman is an extra script that runs on gentoo systems and does check md5s.  I will change the script to automatically check the cache, but you can check your files now.

If you want to check the files in replicator's cache now, just

```

mv /var/cache/http-replicator/*  /usr/portage/distfiles/

/usr/bin/repcacheman

```

repcacheman will import 15,000+ md5s from portage and check every file in /usr/portage/distfiles.  The files that pass the md5 check will be moved to replicator's cache.
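The kind of audit described above can be sketched in a few lines of Python — a toy illustration, not repcacheman's actual code; the digests dict stands in for the md5 sums repcacheman imports from portage, and the filenames are hypothetical:

```python
# Toy md5 audit of a cache directory against a table of known digests.
# The digests mapping stands in for the sums imported from portage.
import hashlib
import os

def audit_cache(cachedir, digests):
    """Return (ok, bad) lists of cached filenames checked against digests."""
    ok, bad = [], []
    for name, expected in digests.items():
        path = os.path.join(cachedir, name)
        if not os.path.isfile(path):
            continue  # not cached yet; nothing to verify
        md5 = hashlib.md5()
        with open(path, "rb") as f:
            # Hash in chunks so large distfiles don't load into RAM at once.
            for chunk in iter(lambda: f.read(65536), b""):
                md5.update(chunk)
        (ok if md5.hexdigest() == expected else bad).append(name)
    return ok, bad
```

Files landing in the bad list would be the ones to delete from the cache and re-fetch.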

----------

## Gherald2

 *flybynite wrote:*   

> The most difficult question is how is replicator to know the file is incomplete?

 Well, why not something crude like filesize?  MD5s are all well and good, but let portage take care of the detail work on its own.

It'd be nice if http-replicator just paid attention to obvious things like file size (e.g. a 3mb incomplete file when it should be a full 8mb download) and thus knew when to restart downloads automatically.

But even that's more complicated than it needs to be; for a start, it'd be nicer if we could configure portage to bypass the http-replicator proxy completely whenever a download fails or an MD5 doesn't match.

----------

## flybynite

 *Gherald wrote:*   

> Well why not something crude like filesize? 

 

I guess my post wasn't clear.  Replicator ALREADY uses filesize!!  That's why I said the only way replicator could possibly save an incomplete file is if the MIRROR doesn't send the size or sends the wrong size AND there is a network error!

This problem isn't replicator specific!  Note this option from the man page of wget as proof of broken servers:

```

--ignore-length

           Unfortunately, some HTTP servers (CGI programs, to be more precise) send out bogus "Content-Length" headers, which makes Wget go wild, as it thinks not all the document was retrieved.  You can spot this syndrome if Wget retries getting the same document again and again, each time claiming that the (otherwise normal) connection has closed on the very same byte.

           With this option, Wget will ignore the "Content-Length" header---as if it never existed.

```

 *Gherald wrote:*   

>  for a start it'd be nicer if we could configure portage to bypass the http-replicator proxy completely whenever a download fails or an MD5 doesn't match.

 

replicator would work better with some help from portage  :Smile:   If the ebuild ever gets accepted I hope to see some help from portage.

----------

## Gherald2

Someone with the right know-how could probably hack up a quick wrapper to emerge that would execute

```
http_proxy="" <previously failed emerge command>
```

Heck, if you wanted to get fancy it'd be feasible to automagically fork an "emerge --fetchonly" process to the background and only restart *IT* upon a failed DL.  Just a convenience, of course, especially for server admins who wish to minimize downtime (or for that matter, maintenance-time).
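Something along those lines could be hacked up today with a tiny shell wrapper.  A sketch -- the function name is made up, and the command is passed in as ordinary arguments so the retry logic is visible (substitute your failed emerge command):

```

#!/bin/sh
# Run a command with the proxy settings as-is; if it fails, retry once
# with http_proxy emptied so portage fetches directly from the mirrors.
fetch_with_fallback() {
    if "$@"; then
        return 0
    fi
    echo "retrying '$*' without http-replicator..." >&2
    http_proxy="" "$@"
}

# hypothetical usage: fetch_with_fallback emerge --fetchonly mutt

```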

----------

## carpman

Hello, I now have http-replicator set up on my net and it is working fine, except for one small issue.

Before using H-R I used this script to clean old ebuild versions from /usr/portage/distfiles

The problem now is that even after editing the paths in the script I can't get it to work on /var/cache/http-replicator

Any way I can get this to work, or get the feature built into H-R?

cheers

----------

## flybynite

 *carpman wrote:*   

> Hello, I now have http-replicator set up on my net and it is working fine, except for one small issue.
> 
> Before using H-R I used this script to clean old ebuild versions from /usr/portage/distfiles
> 
> The problem now is that even after editing the paths in the script I can't get it to work on /var/cache/http-replicator
> ...

 

That script works for me...  I don't see any reason why it shouldn't work for you...  What kind of error do you get?

There are many different types of cache cleaning scripts, they all should work with http-replicator.....

----------

## carpman

 *flybynite wrote:*   

> 
> 
> That script works for me...  I don't see any reason why it shouldn't work for you...  What kind of error do you get?
> 
> There are many different types of cache cleaning scripts, they all should work with http-replicator.....

 

There were no errors; it just did not find the old ebuilds that I knew were there.

I will double-check the paths and try again.

----------

## meowsqueak

Would this program work properly as-is on a non-Gentoo host? I'm thinking of using my Debian server as the http-replicator proxy but I'm not sure how coupled it is to emerge/portage.

----------

## rpcyan

I've been using your http-replicator for a week now and so far I have been very impressed.  I currently have 6 gentoo boxes including the server, and I haven't done an emerge world -uDp in a while, so I have built up a decent cache and have been getting great speeds.

However, I am working in a college environment and I don't want anybody taking advantage of an open http proxy on the server.  Somebody could theoretically use it to do "bad things" and implicate me in the process.

We have DHCP on campus that I have no control over, and I can't guarantee the IP of my client machines for any amount of time.  Is it possible to make a whitelist of authorized clients based on DNS instead of IP?

Alternatively, is there any way to limit http-replicator to the gentoo mirrors in /etc/make.conf?  I just set my proxy settings appropriately and I was able to pull up google.com, which is obviously not necessary for gentoo caching.

Again, great piece of software.  Glad to do my part in minimizing the bandwidth loads on the public mirrors.

----------

## yahewitt

OK -- I "sort of" have this running. I installed http-replicator on the server and configured the *.conf details as in the Howto on the "server", and make.conf on my one "client". 

When emerging something on the client that has already been emerged on the server, I get the package via the LAN -- but when I emerge on the client, a copy does not seem to be kept on the server machine, so if I then emerge the same thing on the server, it downloads again. I can't see where I've screwed up the config (assuming this isn't supposed to happen!) -- can anyone suggest where to look for debug information on what the proxy does with the copy on the server machine?

thanks!

[*doh*] Never mind - I screwed up the permissions on the cache directory! It's working fine now, a neat bit of work.

Last edited by yahewitt on Mon Aug 30, 2004 12:53 am; edited 1 time in total

----------

## rpcyan

Well, the best way to diagnose your problem is to post your http-replicator.conf.

I assume you have run repcacheman?

----------

## ponion

I'm about to try to install Gentoo 2004.2 on a second machine.

I'd like to try using http-replicator on my first machine to save the second machine from getting everything from the internet.

I've read through this thread, and it is unclear what the latest version is that I should be using.

The first post in the thread says 

 *Quote:*   

> 
> 
> "To install on the server:
> 
> 1. Download and install the ebuild in the portage overlay directory (/usr/local/portage by default), then emerge http-replicator.
> ...

 

But at the top of the page it says 

 *Quote:*   

> 
> 
> *** Unstable version including external proxy support - https://forums.gentoo.org/viewtopic.php?t=173226&start=76
> 
> 

 

Which points at 1.5?

So what is the latest stable version?

Peter.

----------

## flybynite

 *meowsqueak wrote:*   

> Would this program work properly as-is on a non-Gentoo host? I'm thinking of using my Debian server as the http-replicator proxy but I'm not sure how coupled it is to emerge/portage.

 

Yes!!!, although I don't have a non-gentoo machine to test  :Smile:    I took care NOT to make the base cache gentoo-specific for just this reason.  Only my helper script repcacheman is gentoo-specific; it wouldn't work with debian, but it isn't required on debian either!  Without repcacheman, you must manually create the cache dir and set permissions when installing.  The other functions in repcacheman aren't needed on debian.  You just need some kind of cache cleaner for when disk space becomes a problem.  There is a script around the board somewhere to delete files over X days old.

Last edited by flybynite on Tue Aug 31, 2004 5:58 pm; edited 1 time in total
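For that last point, a one-line find job is enough for a cache cleaner on debian (or anywhere).  A sketch, run here against a scratch directory so it's safe to try as-is; point it at /var/cache/http-replicator, and pick your own age cutoff, for real use:

```

#!/bin/sh
# Demo cache cleaner: delete cached files untouched for 30+ days.
CACHE_DIR=$(mktemp -d)                           # stand-in for /var/cache/http-replicator
touch -d '40 days ago' "$CACHE_DIR/old.tar.bz2"  # simulate a stale distfile
touch "$CACHE_DIR/fresh.tar.bz2"                 # and a recent one

find "$CACHE_DIR" -type f -mtime +30 -delete     # only old.tar.bz2 is removed

```

Drop the find line into a daily cron job and disk space takes care of itself.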

----------

## flybynite

 *rpcyan wrote:*   

> I've been using your http-replicator for a week now and so far I have been very impressed. (snip)  I have built up a decent cache and have been getting great speeds.
> 
> 

 

Thanks!

 *rpcyan wrote:*   

> 
> 
> However, I am working in a college environment and I don't want anybody taking advantage of the open http proxy on the server.  Somebody theoretically could use it to do "bad things" and implicate me in the process.
> 
> 

 

 Http-Replicator isn't an "open" proxy server!  You should know that http-replicator was designed with security in mind.  It has been thoroughly tested against security scanners and several security checks are in place!!  (Nessus still shows 1 false positive if you test it yourself.)  The IP access restrictions are part of those security checks.  Another restriction is that only access to port 80 is allowed.  See the code for more. 

 *rpcyan wrote:*   

> 
> 
> We have DHCP on campus that I have no control over, and I can't guarantee the IP of my client machines for any amount of time.  Is it possible to make a whitelist of authorized clients based on DNS instead of IP?
> 
> 

 

Sounds like you're running a cache for your friends and not the whole campus?  The best thing might be to convince the campus admins how much bandwidth they could save by running a cache for everybody!  Stats will be in the next release to help convince them!!

Anyway, back to your question.  I've considered this already and decided for now that http-replicator isn't the right place to implement it.   What you probably want is to dynamically firewall your box.  There are scripts on the net that will let you set firewall rules based on log entries or various conditions you define.  Dynamic firewall rules are the cleanest solution.
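As a starting point, even static iptables rules give you a whitelist; the dynamic scripts just add and remove rules like these on the fly.  A sketch -- the subnet and replicator's 8080 port are assumptions taken from this thread's examples, so adjust them for your LAN:

```

# allow your own subnet to reach http-replicator, drop everyone else
iptables -A INPUT -p tcp --dport 8080 -s 192.168.1.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 8080 -j DROP

```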

 *rpcyan wrote:*   

> 
> 
> Alternatively, is there any way to limit http-replicator to the gentoo mirrors in /etc/make.conf?  I just set my proxy settings appropriately and I was able to pull up google.com, which is obviously not necessary for gentoo caching.
> 
> 

 

There is ongoing work on this, but there isn't a clean solution yet.  Portage will break if you limit the mirrors to those in /etc/make.conf.  

In addition, the cache would be severely limited if you tried to limit the upstream mirrors to which it can connect.  For example, all ebuilds contain their own "homepage" download mirror source.  These can change with every sync!!  Many packages are not mirrored but must be downloaded from the "homepage" server.  Others are too new and aren't in the mirrors yet, etc.  The end result is that I can't predict the mirror that a package will come from, and the user's emerge could fail if there isn't a backup ftp source for the file.

If you need some filtering of requests, Http-Replicator can pass all requests to a filtering proxy such as squid or any of the hundreds of other filtering proxies.  If that isn't enough for you, you probably shouldn't be running a proxy.

In summary:

Http-Replicator is a proxy with access restrictions and security checks.  It is designed with a campus LAN in mind where either you have the authority to have abusers prosecuted or have a certain level of trust in your clients.  All client requests are logged along with their IP.

  If you need more access limits, look into dynamic firewall rules.  If you want to place more limits on where your users go, http-replicator can forward all requests to a filtering proxy such as squid or another filter/access limiter.

Last edited by flybynite on Tue Aug 31, 2004 6:02 pm; edited 1 time in total

----------

## flybynite

 *ponion wrote:*   

> 
> 
> I've read through this thread, and it is unclear what the latest version is that I should be using.
> 
> 

 

Seems clear to me  :Smile:   Actually, it probably doesn't matter.  

The howto has the most tested version and complete instructions.  For the latest "unstable" version, the instructions only cover the changes from stable, not a complete install.   The unstable version adds external proxy support and some prettying of the code, but both work equally well.....

So, if you don't need a proxy to get to the net, just use the howto!  You can upgrade later when I update the howto.   If you have to have the latest, I recommend following the howto, getting it working, and then upgrading to the latest version.

Actually the latest unstable version is very stable  :Smile:  I just haven't updated the howto to include it yet!!

----------

## rpcyan

 *flybynite wrote:*   

> 
> 
> Http-Replicator isn't an "open" proxy server!  You should know that http-replicator was designed with security in mind.  It has been thoroughly tested against security scanners and several security checks are in place!!  ( Nessus still shows 1 false positive if you test it yourself)  The IP access restrictions are part of those security checks.  Another restriction is that only access to port 80 is allowed.  See the code for more. 
> 
> 

 

Well, the fact that I can set firefox to the http-replicator proxy and pull up any website I want worries me.  I don't see a breach of the box occurring, but I do see the possibility of it being used for something it wasn't intended for.

 *flybynite wrote:*   

> 
> 
> Sounds like your running a cache for your friends and not the whole campus?  The best thing might be to convince the campus admins how much bandwidth they could save by running a cache for everybody!  
> 
> 

 

Well, again, I figured the best way to lock it down so that it couldn't be used for "bad things" is a whitelist.  Having the proxy be public knowledge doesn't change that fact, but it does mean that I wouldn't be able to manage a whitelist.  As far as the campus admins go, they're pretty Red Hat and not the friendliest folk around.  Plus we have a huge pipe, so I doubt they'd care much.

Your comment about running it through squid with filters will probably be the way to go for my situation.

----------

## drakos7

Installed 2.1rc3 and it works great! Thanks flybynite!   :Smile: 

 :Question:  Does putting a script on the server in cron.daily to rm /usr/portage/distfiles/*  make sense to keep things clean?

----------

## flybynite

 *drakos7 wrote:*   

> Installed 2.1rc3 and it works great! Thanks flybynite!  
> 
>  Does putting a script on the server in cron.daily to rm /usr/portage/distfiles/*  make sense to keep things clean?

 

No, repcacheman does that and more.  Put repcacheman in cron.daily!!
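Concretely, that's just a two-line script dropped into /etc/cron.daily and marked executable.  A sketch -- the filename is my choice, the repcacheman path matches the howto, and a scratch directory stands in for /etc/cron.daily so the demo is safe to run:

```

#!/bin/sh
# Install a tiny daily job; CRON_DIR would be /etc/cron.daily on a real box.
CRON_DIR=$(mktemp -d)
cat > "$CRON_DIR/repcacheman" <<'EOF'
#!/bin/sh
exec /usr/bin/repcacheman
EOF
chmod +x "$CRON_DIR/repcacheman"

```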

----------

## robfantini

Hello,

 I've set up H-R on one computer and it is running.

I get this error on the client:

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 400 Bad Request

21:19:14 ERROR 400: Bad Request.

----------

here is part of my /etc/make.conf on the computer with H-R running:

http_proxy="http://192.168.1.3:8080"

RESUMECOMMAND=" /usr/bin/wget -t 5 --passive-ftp  \${URI} -O \${DISTDIR}/\${FILE}"

client: same lines....

 As I write this, I suppose the problem is that I don't have httpd running on the server?

----------

## robfantini

I think that I've followed the install instructions exactly...   But I get the following error when I try to emerge a program from the server:

Do you want me to merge these packages? [Yes/No] yes

>>> emerge (1 of 2) sys-apps/module-init-tools-3.0-r2 to /

>>> Downloading http://gentoo.oregonstate.edu/distfiles/modutils-2.4.26.tar.bz2

--14:25:07--  http://gentoo.oregonstate.edu/distfiles/modutils-2.4.26.tar.bz2

           => `/usr/portage/distfiles/modutils-2.4.26.tar.bz2'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... '

14:25:07 ERROR 400: Bad Request.

.................................................................................

this is from /var/log/http-replicator:

HttpClient 10 Received header from 192.168.1.3:47408

  GET http://gentoo.oregonstate.edu/distfiles/modutils-2.4.26.tar.bz2 HTTP/1.0

  User-Agent: Wget/1.9

  Host: gentoo.oregonstate.edu

  Accept: */*

HttpClient 10 Connecting to gentoo.oregonstate.edu

HttpServer 10 Received header from 140.211.166.134:80

  HTTP/1.0 400 Bad Request

  Content-Type: text/html

HttpServer 10 Closed

HttpClient 10 Closed

..........................................................................

from /etc/make.conf:

PORTDIR_OVERLAY=/bkup/portage/server/local

http_proxy="http://192.168.1.3:8080"

RESUMECOMMAND=" /usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}

"

.........................................................................

 The server is behind a firewall.  I made it so port 80 is forwarded to our server.  Apache is running and I can access the Apache test web page from off site.

 Does anyone have a suggestion on how the 400 Bad Request error can be solved?

 Also, do I need to have Apache running for H-R to work?

Thanks!

Rob

----------

## flybynite

I have a couple of things for you to try....

 *robfantini wrote:*   

> I think that I've followed the install instructions exactly...   But I get the following error when I try to emerge a program from the server:
> 
> 14:25:07 ERROR 400: Bad Request.
> 
> 

 

This error didn't originate from http-replicator - it came directly from the upstream server, as seen in replicator's log.

What happens when you try wget by itself? 

```

wget   http://gentoo.oregonstate.edu/distfiles/modutils-2.4.26.tar.bz2 

```

Do you get the same error?

Is there really a newline in your make.conf between {FILE} and the closing quote?

 *Quote:*   

> 
> 
> RESUMECOMMAND=" /usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}
> 
> "
> ...

 

 *robfantini wrote:*   

> 
> 
>  The server is behind a firewall..  I made it so port 80 is forwarded to our server.  Apache is running and i can access  the Apache test web page from off site.
> 
> 

 

I don't understand why you mention this and what you're trying to do here.  Nowhere in the howto does it mention apache or port forwarding.  Http-replicator doesn't need apache running, but apache won't bother it either.  It doesn't need port 80 forwarded.  You might be causing the problems with your forwarding/firewall setup.  But probably not, if you're just using a typical hardware router like a linksys etc.

Setting the debug option in replicator will give more verbose output.  Try that and post the output.

----------

## robfantini

What happens when you try wget by itself? 

```

wget   http://gentoo.oregonstate.edu/distfiles/modutils-2.4.26.tar.bz2 

```

 wget downloaded the file. 

 *Quote:*   

> 
> 
> Is there really a newline in your make.conf between {FILE} and the "?
> 
> 

 

  No, there is not a newline. It looked like there was one due to the font size I use in konqueror.

 *Quote:*   

> 
> 
>  Setting the debug option in replicator will give more verbose output also.  Try that and post the output.

 

  Ok, I'll do that.

 *Quote:*   

> 
> 
> I don't understand why you mention this and what your trying to do here. Nowhere in the howto does it mention apache nor port forwarding. Http-replicator doesn't need apache running but won't bother it either. It doesn't need port 80 forwarding. You might be causing the problems with your forwarding/firewall setup. But probably not if your just using a typical hardware router like linksys etc. 
> 
> 

 

 I can see how that can be confusing... I did not have apache installed and thought it might be needed...  I'm not familiar with apache, so it seems complex..  I'm glad to know now that H-R does not need apache or a web server running to work.

 Thanks for the suggestions.

----------

## robfantini

Debug is enabled in /etc/http-replicator.conf .

Here is output from an emerge.

Also I've posted my make.conf below.

It looks like http downloads are not working and ftp downloads are:

Calculating dependencies ...done!

[ebuild  N    ] sys-kernel/gentoo-dev-sources-2.6.8-r3  -build -doc -ultra1  34,936 kB

Total size of downloads: 34,936 kB

Do you want me to merge these packages? [Yes/No] yes

>>> emerge (1 of 1) sys-kernel/gentoo-dev-sources-2.6.8-r3 to /

>>> Downloading http://gentoo.oregonstate.edu/distfiles/genpatches-2.6-8.50-extras.tar.bz2

--08:01:28--  http://gentoo.oregonstate.edu/distfiles/genpatches-2.6-8.50-extras.tar.bz2

           => `/usr/portage/distfiles/genpatches-2.6-8.50-extras.tar.bz2'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 400 Bad Request

08:01:28 ERROR 400: Bad Request.

>>> Downloading http://mirror.datapipe.net/gentoo/distfiles/genpatches-2.6-8.50-extras.tar.bz2

--08:01:28--  http://mirror.datapipe.net/gentoo/distfiles/genpatches-2.6-8.50-extras.tar.bz2

           => `/usr/portage/distfiles/genpatches-2.6-8.50-extras.tar.bz2'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 400 Bad Request

08:01:28 ERROR 400: Bad Request.

>>> Downloading http://open-systems.ufl.edu/mirrors/gentoo/distfiles/genpatches-2.6-8.50-extras.tar.bz2

--08:01:28--  http://open-systems.ufl.edu/mirrors/gentoo/distfiles/genpatches-2.6-8.50-extras.tar.bz2

           => `/usr/portage/distfiles/genpatches-2.6-8.50-extras.tar.bz2'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 400 Bad Request

08:01:28 ERROR 400: Bad Request.

>>> Downloading ftp://ftp.gtlib.cc.gatech.edu/pub/gentoo/distfiles/genpatches-2.6-8.50-extras.tar.bz2

--08:01:28--  ftp://ftp.gtlib.cc.gatech.edu/pub/gentoo/distfiles/genpatches-2.6-8.50-extras.tar.bz2

           => `/usr/portage/distfiles/genpatches-2.6-8.50-extras.tar.bz2'

Resolving ftp.gtlib.cc.gatech.edu... 130.207.108.134

Connecting to ftp.gtlib.cc.gatech.edu[130.207.108.134]:21... connected.

Logging in as anonymous ... Logged in!

==> SYST ... done.    ==> PWD ... done.

==> TYPE I ... done.  ==> CWD /pub/gentoo/distfiles ... done.

==> PASV ... done.    ==> RETR genpatches-2.6-8.50-extras.tar.bz2 ... done.

    [    <=>                                ] 107,808      141.14K/s

08:01:47 (140.74 KB/s) - `/usr/portage/distfiles/genpatches-2.6-8.50-extras.tar.bz2' saved [107808]

>>> Downloading http://gentoo.oregonstate.edu/distfiles/linux-2.6.8.tar.bz2

--08:01:47--  http://gentoo.oregonstate.edu/distfiles/linux-2.6.8.tar.bz2

           => `/usr/portage/distfiles/linux-2.6.8.tar.bz2'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 400 Bad Request

08:01:47 ERROR 400: Bad Request.

>>> Downloading http://mirror.datapipe.net/gentoo/distfiles/linux-2.6.8.tar.bz2

--08:01:47--  http://mirror.datapipe.net/gentoo/distfiles/linux-2.6.8.tar.bz2

           => `/usr/portage/distfiles/linux-2.6.8.tar.bz2'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 400 Bad Request

08:01:47 ERROR 400: Bad Request.

>>> Downloading http://open-systems.ufl.edu/mirrors/gentoo/distfiles/linux-2.6.8.tar.bz2

--08:01:47--  http://open-systems.ufl.edu/mirrors/gentoo/distfiles/linux-2.6.8.tar.bz2

           => `/usr/portage/distfiles/linux-2.6.8.tar.bz2'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 400 Bad Request

08:01:47 ERROR 400: Bad Request.

>>> Downloading ftp://ftp.gtlib.cc.gatech.edu/pub/gentoo/distfiles/linux-2.6.8.tar.bz2

--08:01:47--  ftp://ftp.gtlib.cc.gatech.edu/pub/gentoo/distfiles/linux-2.6.8.tar.bz2

           => `/usr/portage/distfiles/linux-2.6.8.tar.bz2'

Resolving ftp.gtlib.cc.gatech.edu... 130.207.108.134

Connecting to ftp.gtlib.cc.gatech.edu[130.207.108.134]:21... connected.

Logging in as anonymous ... Logged in!

==> SYST ... done.    ==> PWD ... done.

==> TYPE I ... done.  ==> CWD /pub/gentoo/distfiles ... done.

==> PASV ... done.    ==> RETR linux-2.6.8.tar.bz2 ... done.

    [              <=>                      ] 7,514,544    156.19K/s           

make.conf:

 # fbc3 gentoo P4 server

# 2004-09-02 see https://forums.gentoo.org/viewtopic.php?t=173226

#            http://gentoo-wiki.com/HOWTO_Download_Cache_for_LAN-Http-Replicator

PORTDIR_OVERLAY=/bkup/portage/server/local

http_proxy="http://192.168.1.3:8080"

RESUMECOMMAND=" /usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}"

# 07-05-04 changed to /bkup

PKGDIR="/bkup/portage/server/packages"

CFLAGS="-march=i686 -mmmx -msse -O2 -pipe -fomit-frame-pointer"

CXXFLAGS="${CFLAGS}"

CHOST="i686-pc-linux-gnu"

# 08-20-04 added maildir. i noticed it excluded from postfix build.

# 2004-09-03 added  mysql apache2 pam ssl  xml xml2  for apache

# type env-update (return) to make your changes take effect!

USE=" mysql apache2 pam ssl xml xml2 maildir -X  -xv -xmms -xosd -trusted -usb -wx

windows -wavelan -pda -pcmcia -oss -opengl -mozilla -kde -gtkhtml -gtk2 -gtk -gpho

to2 -gnome -dga -3dfx -3dnow -arts nocardbus mmx cups "

# # hyperthreading cpu , use j3  xeon4 on fbc3

MAKEOPTS="-j2"

PORTAGE_NICENESS=15

FEATURES="buildpkg"

GENTOO_MIRRORS="http://mirror.datapipe.net/gentoo http://open-systems.ufl.edu/mirrors/gentoo ftp://ftp.gtlib.cc.gatech.edu/pub/gentoo"

----------

## flybynite

I didn't see anything wrong in your make.conf.

The debug option will put more info in http-replicator's log.  That's what I need to see.

The bad request just shouldn't happen.  I wonder if you downloaded the ebuild or opened the archive in windows?  If so, the script is probably corrupted.  You should download and reinstall http-replicator next.

----------

## robfantini

solved.

 My problem has nothing to do with H-R.

 I stopped H-R, tried to emerge a program, and had the same issues.  For some reason emerge has errors when trying to wget over HTTP.

 Sorry about wasting your time!  I'll get this fixed and then look forward to using your excellent program!

----------

## fzxdude

why not just nfs the distfiles dir on one of the machines?

ya download the package once for all the machines with that share

----------

## robfantini

 *fzxdude wrote:*   

> why not just nfs the distfiles dir on one of the machines?
> 
> ya download the package once for all the machines with that share

 

 I'm doing that now. The problem is we have 10+ linux pc's.  Using nfs works ok, but if 2 pc's tried to download the same distfile at the same time then there would be a problem.

 Using http-replicator would prevent this. 

 As I could not get H-R working, I'll try using rsync daily to mirror all gentoo distfiles to my nfs directory.

----------

## fzxdude

ya we just have 7 here ... so it's not that bad...

----------

## flybynite

First, let me say that if you own all the other boxes, nfs sharing is an option.  There are corruption and other issues you need to worry about, but if you can control all the boxes then you can minimize the chance of corruption. 

I don't want to turn this thread into a flamewar about the potential corruption and other problems with sharing the distfile dir. I'd ask that someone start another thread to discuss that issue if anyone has any questions.

This thread is about the advantages of Http-Replicator!!

Http-Replicator is an easy and quick install that relieves users of virtually all the headaches associated with multiple boxen.  Once installed, there are no corruption or other issues to worry about.  Any box can emerge any package at any time, and http-replicator will make sure the downloads happen in a safe and efficient manner!!

Think about this advantage in a situation like a college campus:  If you have 10 (or 100) avid gentooers that start a mozilla upgrade at nearly the same time what will happen with nfs sharing?

All 10 (or 100) boxes will download a separate copy of the 34438800 byte file!!  Why?  Because each box will look for the package, see an incomplete download, and start a new download to resume the file from the already heavily loaded volunteer mirrors.

If Http-Replicator were in use, it would intercept all 10 (or 100) download requests from the LAN and would simultaneously download the file from the internet and stream it to all 10 (or 100) users.  The result would be only 1 copy of the package downloaded from the internet, with all LAN clients simultaneously receiving the package!!  Any client that requests the package after the download is complete will receive the file from the cache at LAN speeds!

The next question you have to ask yourself is what were you thinking when you tried to nfs share your distfiles dir with 10 (or 100) semi-strangers on your college campus!!

----------

## meowsqueak

I agree totally. Http-replicator is the right tool for the job. Sharing over NFS is a horrible hack that will bite you eventually (but I did use it for a long time until I discovered HR).

----------

## MattSharp

I'm sure someone has had this problem, but I didn't see anything about it. When I try to emerge packages on the server, it has problems downloading the package:

 *Quote:*   

> 
> 
> lancelot root # emerge mutt         
> 
> Calculating dependencies ...done!
> ...

 

What is causing this?

----------

## flybynite

 *MattSharp wrote:*   

> I'm sure someone has had this problem, but I didn't see anything about it. When I try to emerge packages on the server, it has problems downloading the package:
> 
> 

 

Your connections are all on port 80, so you're not downloading the package through http-replicator....  Are you trying to, or is this a general gentoo problem you're having?

----------

## MattSharp

 *flybynite wrote:*   

>  *MattSharp wrote:*   I'm sure someone has had this problem, but I didn't see anything about it. When I try to emerge packages on the server, it has problems downloading the package:
> 
>  
> 
> Your connections are all on port 80, so you're not downloading the package through http-replicator....  Are you trying to, or is this a general gentoo problem you're having?

 

I am trying to use http-replicator. This is the "server" box. The clients seem to work fine...I think.

But when I try to get a package on there, I have that problem. I commented out the http_proxy line because that didn't work either. Here is what I get when I try it with that line:

 *Quote:*   

> 
> 
> Calculating dependencies ...done!
> 
> >>> emerge (1 of 1) mail-client/mutt-1.5.6-r3 to /
> ...

 

So what am I doing wrong? One thing I did notice. If I try to emerge mutt on one of the "clients" it downloads it from the "server" and works fine. But I can't emerge it on the server. What is the problem?

Also, one scenario I have is that one of my clients is a desktop and it's not always on the network. Is there an easy way to make it not try to use the server when it's not on the network? Maybe comment something out or something? Or maybe write into it that if the server is missing, it tries something else?

----------

## flybynite

 *MattSharp wrote:*   

> I am trying to use http-replicator.
> 
> ...
> 
>  I commented out the http_proxy line 
> ...

 

If you're using the standard cache and distfile locations, try this:

```

mv /var/cache/http-replicator/*  /usr/portage/distfiles/

/usr/bin/repcacheman

rm /usr/portage/distfiles/*

```

 *MattSharp wrote:*   

> 
> 
> Also, one scenario I also have is that one of my clients is a desktop and it's not always on the network. Is there an easy way to make it not try to use the server when its no on the network?

 

What I do is use Quickswitch for my laptop (emerge quickswitch).  Quickswitch is designed to allow multiple configs for different networks with different services.  I have a setting that switches my /etc/make.conf when I move off my lan.

More info here: http://www.newsforge.com/article.pl?sid=01/12/22/2118213&mode=thread

Homepage http://muthanna.com/quickswitch

How to change on boot with gentoo https://forums.gentoo.org/viewtopic.php?t=96281&highlight=quickswitch+boot

----------

## Master One

That's so great! At first I was fighting with some strange problem, but now it seems to be working just fine.

Nevertheless someone please tell me: *Quote:*   

> Don't forget that portage needs mirrors! Edit GENTOO_MIRRORS in /etc/make.conf to add more http mirrors and place any ftp mirrors LAST. The default mirrors in gentoo leave something to be desired Smile Use mirrorselect if you need help in selecting mirrors. 

 

Is GENTOO_MIRRORS to be set on all machines (server + workstations), or only on the server?

 *Quote:*   

> Also, some packages in portage have a RESTRICT="nomirror" option which will prevent portage from checking replicator for those packages. The following will override this behavior. Create the file "/etc/portage/mirrors" containing: local http://gentoo.oregonstate.edu

 

Is this mirrors file to be created on all machines (server + workstations), or only on the server?

I find this mirrors thing a little confusing, because why should the workstations bother with it if only the server is intended to download and cache the distfiles?

My last question: why is nothing showing up in the http-replicator log file? I can see the file gets generated on startup, but so far I have never found any info in there, even though I did some successful downloads (indeed the file always has a size of zero).

----------

## meowsqueak

The mirrors need to be specified for every user of the proxy, since a request is sent to the proxy for a file on a particular mirror, and the proxy 'does its thing' - it's meant to be fairly transparent to the client. I use the same set of mirrors for every node in the network, including the server.

The clients need the mirrors because the server allows the client to download the file 'by proxy'.
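Putting that together, a client's /etc/make.conf carries both the proxy setting and the mirror list. This is only a sketch - the hostname and mirror URLs below are placeholders, not a recommendation:

```shell
# Hypothetical client /etc/make.conf fragment: every http fetch goes
# through the replicator box, and all clients share the same mirror list.
http_proxy="http://replicator-box.lan:8080"
GENTOO_MIRRORS="http://mirror1.example.org/gentoo/ http://mirror2.example.org/gentoo/"
```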

As for the log file - check the permissions are OK.

----------

## Master One

 *meowsqueak wrote:*   

> The mirrors need to be specified for every user of the proxy, since a request is sent to the proxy for a file on a particular mirror, and the proxy 'does its thing' - it's meant to be fairly transparent to the client. I use the same set of mirrors for every node in the network, including the server. The clients need the mirrors because the server allows the client to download the file 'by proxy'. As for the log file - check the permissions are OK.

 

Ok, I've set GENTOO_MIRRORS and /etc/portage/mirrors now on all machines (server + workstations).

It's definitely working fine, but I still couldn't find out why the logfile keeps being empty. I tried it with chown root:root and portage:portage, but that doesn't change anything (http-replicator.conf explicitly mentions that no write access has to be granted anyway). The logfile definitely gets created on startup if I delete it beforehand, but until now nothing has shown up in there...   :Confused: 

----------

## flybynite

 *Master One wrote:*   

> 
> 
> It's definitely working fine, but I still could not find out, why the logfile keeps beeing empty.

 

The stable version of http-replicator writes the log file in a very lazy way.  It won't actually write the log entries to disk until the buffer fills.  It uses fewer system resources this way on a busy server, but it can take a while to flush to disk on a home lan  :Smile:   This has been changed in the latest version of http-replicator, which flushes every log entry to disk.

So if you're just trying to make sure your setup is OK, it probably is.  If you want to see the logs now, just use the telnet monitor and you can see the messages as they happen!  Check your config to make sure it is enabled and then just telnet to the port to see any messages in real time!

----------

## Master One

Thanks, flybynite. As everything is working, I just wanted the log to confirm that. The less resources it takes, the better. Hopefully it will make it into portage soon, so that I can benefit from the ongoing development.

----------

## jdoe

hi..

i'm trying to set up http-replicator for my 3-client lan. I followed the howto, but i have a problem on the client side.

On the server everything works fine; i can see a cached file downloading at 15MB/s.

On client side i have this error:

```
Calculating dependencies ...done!

>>> emerge (1 of 1) net-fs/samba-3.0.7-r1 to /

>>> Resuming download...

>>> Downloading http://samba.idealx.org/dist/smbldap-tools-0.8.5.tgz

--17:39:32--  http://samba.idealx.org/dist/smbldap-tools-0.8.5.tgz

           => `/usr/portage/distfiles/distfiles/smbldap-tools-0.8.5.tgz'

Resolving slave... 192.168.1.3

Connecting to slave[192.168.1.3]:12000... connected.

Proxy request sent, awaiting response... 503 Service Unavailable

17:39:52 ERROR 503: Service Unavailable.

!!! Couldn't download smbldap-tools-0.8.5.tgz. Aborting.
```

here is conf files:

client make.conf:

```
GENTOO_MIRRORS="http://www.die.unipd.it/pub/Linux/distributions/gentoo-sources/ http://sunsite.cnlab-switch.ch/ftp/mirror/gentoo/"

http_proxy="http://slave:12000"

RESUMECOMMAND="/usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}"

```

server make.conf

```

http_proxy="http://127.0.0.1:12000"

RESUMECOMMAND="/usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}"

GENTOO_MIRRORS="http://www.die.unipd.it/pub/Linux/distributions/gentoo-sources/ http://sunsite.cnlab-switch.ch/ftp/mirror/gentoo/"

```

/etc/http-replicator.conf

```

PORT = 12000

IP = ['127.0.0.1','192.168.1.2','192.168.1.3']

```

where am i going wrong?

Thanks, jdoe

----------

## lisa

For distfile sharing I've set up Samba.  It's way easier.  For rsync I have my server set up as an rsync mirror for all of my internal Gentoo machines.

----------

## BlinkEye

 *jdoe wrote:*   

> server make.conf
> 
> ```
> 
> http_proxy="http://127.0.0.1:12000"
> ...

 

change your http_proxy="http://127.0.0.1:12000" to either your external ip or to 192.168.1.3 (which would be your lan ip), i.e. to http_proxy="http://192.168.1.3:12000"

----------

## meowsqueak

 *lisa wrote:*   

> For distfile sharing I've set up Samba.

 

That doesn't solve the concurrency problem either, does it?

----------

## jdoe

 *BlinkEye wrote:*   

> 
> 
> change your http_proxy="http://127.0.0.1:12000" to either your external ip or to 192.168.1.3 (which would be your lan ip), i.e. to http_proxy="http://192.168.1.3:12000"

 

thanks, it works  :Smile: 

It's really a nice tool...

----------

## fritz

it's working, but cpu usage is ~25% continuously on the server while downloading with a single client

doesn't matter if the file is already in the cache or not

i think this is rather high for just downloading. am i the only one with this problem, or is this a lack of efficiency in http-replicator?

cpu is:

model name      : AMD Duron(tm) Processor

cpu MHz         : 1016.568

----------

## flybynite

 *fritz wrote:*   

> 
> 
> i think this is rather high for just downloading. am i the only one with this problem, or is this a lack of efficiency in http-replicator?
> 
> 

 

I have an even slower box and I don't see cpu usage that high while maxing out a 100 Mbit LAN.  CPU usage depends on many factors though.  http-replicator is usually waiting on your disk or network.  If I had to guess, I'd say you need to check your network config - especially your network card module or driver.  I've heard of some cards that have two working drivers, one of which causes high cpu usage.  It could also be as simple as not having DMA enabled on your disks.

Start checking by transferring some data on your lan with ftp or http and watching the cpu usage.
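One rough way to run that check, as a sketch: time a pure disk read and watch the CPU it burns in another terminal, then compare with a LAN transfer.

```shell
# Rough sketch: create a throwaway 64 MB file, read it back to /dev/null
# while timing it, and watch `top` in another terminal to see the CPU
# cost of pure disk I/O (no network involved).
dd if=/dev/zero of=/tmp/ddtest bs=1M count=64 2>/dev/null
time dd if=/tmp/ddtest of=/dev/null bs=1M
rm -f /tmp/ddtest
```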

----------

## fritz

indeed, transferring a large file through nfs also takes 10-15% cpu usage, which also seems rather high. dma is on though, and i'm using the sis900 driver - any known issues with that one?

hmm it's an onboard-nic, maybe that got something to do with it?

cheers

----------

## flybynite

 *fritz wrote:*   

> i'm using the sis900 driver, any issues with that one?
> 
> 

 

A quick google shows a few possibilities that may or may not apply to your exact motherboard/kernel combination.

Most point towards 2.6.X problems with that chipset, while 2.4.X works better.  The problem seems to be shared PCI IRQs, which shows up under high demand on both the disk and the network driver.  Try copying a large file to another disk or to /dev/null and see if the cpu usage is still high.

Some fixes I've seen recommend making sure to disable the Plug and Play support in the motherboard BIOS.

Check your dmesg for errors also.

Sorry I couldn't give you an easy fix.....

----------

## fritz

using 2.6.8-gentoo, no errors in dmesg, and the sis900 gets its own irq. what's weird is that cat'ing a file to /dev/null takes 15% usage, and using dd 30%  :Shocked:  (probably not related to my problem, just weird). is this too high? i get similar results on another computer.

appreciate the help and the fast reply   :Smile: 

----------

## flybynite

Those results would seem to say that your level of cpu usage is normal for your box and that http-replicator is very efficient, using the same cpu as cat'ing to /dev/null.....

----------

## fritz

 *flybynite wrote:*   

> Those results would seem to say that your level of cpu usage is normal for your box and that http-replicator is very efficient, using the same cpu as cat'ing to /dev/null.....

 

ooooooh   :Embarassed:   indeed, i see your point

guess i'll just accept that it takes 25% of my cpu then. i'll try to play a bit with hdparm

----------

## piyo

The instructions say nothing about what to clean periodically in the cache directory.

Here I delete anything that hasn't been referenced in the last two weeks.

```
colinux root # cat /etc/cron.daily/http-replicator

#!/bin/sh

/bin/nice /usr/bin/find /var/cache/http-replicator -type f -atime +14 | \

    /usr/bin/xargs --no-run-if-empty /usr/bin/rm -f

/bin/nice /usr/bin/repcacheman

```

---

piyo

----------

## flybynite

 *piyo wrote:*   

> The instructions say nothing about what to clean periodically in the cache directory.
> 
> 

 

Thanks for the script piyo!

I've talked about this before in the thread.  The truth is it's much harder than it looks.  That's the reason gentoo doesn't have a default way of cleaning distfiles either - different people need different things.

Most scripts on this board that clean distfiles will work with http-replicator's cache.  You can find scripts to clean the cache based on any number of criteria.  

Your script is nicely done.  I like the way you nice'd the tasks and you remembered to still run repcacheman.

The one problem I see with your script is that many gentooers have followed the Gentoo Install Handbook's recommendation of using `noatime` in fstab for performance reasons. 

 *Quote:*   

> 
> 
> Now, to improve performance, most users would want to add the noatime option as mountoption, which results in a faster system since access times aren't registered (you don't need those generally anyway):
> 
> 

 

 From http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=8#doc_chap1

With noatime as a mount option, your script would simply delete files older than two weeks regardless of how recently they were served to clients.......

----------

## piyo

 *flybynite wrote:*   

> The one problem I see with your script is that many gentooers have followed the Gentoo Install Handbook's recommendation of using `noatime` in fstab for performance reasons. 
> 
> ...
> 
> With noatime as a mount option, your script would simply delete files older than two weeks regardless of how recently they were served to clients.......

 

Whoa, thanks for the correction! I was wondering why the script wasn't really working. I'm one of those gentooers! I'll be sure to delete noatime from my /etc/fstab, because I really want atime. How else can you get the access and change times, unless you modify the script to get those values from somewhere else?

---

piyo

----------

## flybynite

 *piyo wrote:*   

> 
> 
> Whoa, thanks for the correction! I was wondering why the script wasn't really working. I'm one of those gentooers! I'll be sure to delete noatime from my /etc/fstab because I really want atime. 

 

I would try and leave noatime.  Without it, every read becomes a disk write which invalidates the disk cache...

 *piyo wrote:*   

> 
> 
> How else can you get the access time and changed time unless you modify your script to output those values?
> 
> 

 

Http-replicator saves a log of every request received from clients.  The log entries were designed for easy parsing to generate stats on cache usage and efficiency.  Those stats would show the last time a file was requested by a client.

The stats generation should be in the next version of http-replicator.  Until then, you could try and parse the log yourself.  This is an actual log entry from my box:

```

10 Oct 2004 02:53:37 INFO: HttpClient 75 received request for http://gentoo.osuosl.org/distfiles/samba-3.0.7.tar.gz

```

There are different levels of logging with various levels of info logged. See the config file....
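Until the built-in stats arrive, a last-request report can be scraped from the log with awk. This is only a sketch based on the single sample line above - the real log format, logging level, and path may differ, and it assumes log lines are in chronological order:

```shell
# Hypothetical sketch: print the most recent request time for each file
# seen in the http-replicator log (format assumed from the sample entry).
awk '/received request for/ {
    url = $NF                              # last field is the requested URL
    n = split(url, parts, "/")
    file = parts[n]                        # basename of the URL
    last[file] = $1 " " $2 " " $3 " " $4   # e.g. "10 Oct 2004 02:53:37"
}
END { for (f in last) print last[f], f }' /var/log/http-replicator
```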

----------

## piyo

 *flybynite wrote:*   

> I would try and leave noatime.  Without it, every read becomes a disk write which invalidates the disk cache...

 

So you prefer speed over correctness?  Premature optimization?  :Wink: 

Admittedly, it's probably true that most people don't use last-accessed time on Linux, and turning it on for a whole drive when you only want to cover the cache probably hinders performance. However, in my normal computer use I don't feel the difference, in Linux or Windows sessions.

Let the system bring you features and take advantage of them. That's what the operating system is for.

---

piyo

----------

## Bob Shroom

hi guys,

i am having a problem with one of my clients, which is supposed to use the replicator proxy... but for some strange reason doesn't want to do so.

i've checked my settings in /etc/make.conf several times now, even started with an entirely new /etc/make.conf (keeping the old one as /etc/make.conf.orig)... and i can't get the damn client to use the http proxy. i'm about to give up on this.

i think my server is set up correctly, as another client in my LAN does connect to it and uses it just fine.

the /etc/http-replicator.conf on the server:

```

server root # cat /etc/http-replicator.conf 

#   ************README-Gentoo Http-Replicator *******************

#   The defaults in Http-Replicator have been changed to work with the

#   default Gentoo install and shouldn't have to be changed.  The only 

#   changes required to activate Http-Replicator are in /etc/make.conf

#   on the clients and the server itself.

#

#   Find the Default fetch command section in /etc/make.conf.  If you are

#   already using one of the alternate fetch commands, apply the changes

#   to your section.  Http-Replicator does not support "continued" downloads.

# 

#   Make the following changes:

#

# 1. Add http_proxy="http://YourProxyHere.com:8080" Line

#       replacing YourProxyHere.com with your proxy hostname or IP address.

# 2. Uncomment (remove the leading '#') from RESUMECOMMAND

# 3. Remove the -c from the RESUMECOMMAND and replace -P \${DISTDIR} with -O \${DISTDIR}/\${FILE}

#

#   It will look like this when complete:

#

#   Default fetch command (5 tries, passive ftp for firewall compatibility)

#   http_proxy="http://YourMirrorHere.com:8080"

#   #FETCHCOMMAND="$PROXY /usr/bin/wget -t 5 --passive-ftp \${URI} -P \${DISTDIR}"

#   RESUMECOMMAND="$PROXY /usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}"

#

#   execute:

#   /etc/init.d/http-replicator start

#   to run http-replicator.

#   execute:

#   rc-update add http-replicator default

#   to make http-replicator start at boot

#   execute:

#   /usr/bin/repcacheman

#   frequently (/etc/cron.daily) to delete

#   dup files and add new files to the cache

#

#   ************END README-Gentoo Http-Replicator *******************

#  This is the configuration file for the replicator proxy server.

#  Settings from this file will apply to the server in daemon mode and also to the cache cleaner script, if used.

#

#  The proxy server will act on port [PORT] of the localhost.

#  The default value of 8080 is a common value for an http proxy server.

#  If you have another http proxy running it is likely that you should change this port.

PORT = 12000

#  Replicator's behaviour can be altered through a number of server flags.

#  They are by default disabled; [FLAGS] is a list of flags that should be enabled.

#  The possible flags are the same as on the command line:

#  * static: never check for modifications

#  * flat: save files in a single directory

#  * debug: crash on exceptions

FLAGS = ['static','flat','debug']

#  For security reasons the hosts for which access to the proxy is granted should be specified in the [IP] list.

#  A '?' can be used as wildcard for a single digit and a '*' for a multiple digits.

#  For example '10.0.?.*' grants access from 10.0.1.25 but not from 10.0.15.25.

IP = ['127.0.0.1','192.168.4.10','192.168.4.20','192.168.4.30']

#  The proxy server can be monitored via telnet on port [TELNET].

#  This is disabled by entering a zero value.

#  Otherwise make sure the port is available or replicator will not start.

TELNET = 0

#  The process user id is set to [USER].

#  The daemon must be started as root because no other user can change into another.

#  Not even [USER] can change into itself!

USER = 'portage'

#  All cached files are saved in directory [DIR].

#  The [USER] should of course have write permission in this directory.

#  Where in this directory the files are actually put depends on if the server is in flat mode.

#  By default the entire directory structure is copied.

DIR = '/var/cache/http-replicator'

#  The process id of the running process is saved in [PID].

#  As this is done before changing into [USER], write permission for [USER] for this file is not needed.

PID = '/var/run/http-replicator.pid'

#  All messages on stdout and stderr are in daemon mode written to the [LOG].

#  Just as for [PID], write permission for [LOG] is not necessary.

LOG = '/var/log/http-replicator'

#  When the proxy server is used to maintain a package cache a cron script can delete the oldest packages.

#  The value of [KEEP] sets the maximum number of versions of each package to be kept.

#  For example a value of one will delete all versions but the most recent.

#  The script is disabled by setting this value to zero.

#  Not implemented in Gentoo yet!

KEEP = 2

```
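The '?' and '*' wildcard rules described in the config comments above can be sketched as a small shell check. `ip_match` is a hypothetical helper, not part of http-replicator, and it assumes '*' means one or more digits, as the comment suggests:

```shell
# Hypothetical helper mirroring the config's wildcard description:
# '?' matches a single digit, '*' matches one or more digits.
ip_match() {
    pattern=$(printf '%s' "$1" | sed -e 's/\./\\./g' -e 's/?/[0-9]/g' -e 's/\*/[0-9]+/g')
    printf '%s' "$2" | grep -Eq "^${pattern}$"
}

ip_match '10.0.?.*' 10.0.1.25  && echo "10.0.1.25 allowed"
ip_match '10.0.?.*' 10.0.15.25 || echo "10.0.15.25 denied"
```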

scenario:

192.168.4.10 = server

192.168.4.20 = client (BAD)

192.168.4.30 = client (GOOD)

the good(=working) client obviously uses the http_proxy-setting in /etc/make.conf

```

good client root # emerge -v netcat

>>> Downloading http://gentoo.oregonstate.edu/distfiles/nc-v6-20000918.patch.gz

--10:25:55--  http://gentoo.oregonstate.edu/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

Verbindungsaufbau zu 192.168.4.10:12000... verbunden.

Proxy Anforderung gesendet, warte auf Antwort... 200 OK

Länge: 8,740

100%[=============================================================================================================================================>] 8,740         --.--K/s

10:25:55 (16.31 MB/s) - »/usr/portage/distfiles/nc-v6-20000918.patch.gz« gespeichert [8740/8740]

>>> Downloading http://gentoo.oregonstate.edu/distfiles/nc110.tgz

--10:25:55--  http://gentoo.oregonstate.edu/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

Verbindungsaufbau zu 192.168.4.10:12000... verbunden.

Proxy Anforderung gesendet, warte auf Antwort... 200 OK

Länge: 75,267

100%[=============================================================================================================================================>] 75,267        --.--K/s

10:25:55 (9.55 MB/s) - »/usr/portage/distfiles/nc110.tgz« gespeichert [75267/75267]

```

even though it's working on this client, i've noticed that it wants to use the official mirrors, which i can't understand, because i've set GENTOO_MIRRORS in /etc/make.conf to use only http:// mirrors here in germany.

```

good client root # cat /etc/make.conf | grep GENTOO_MIRRORS

# Portage uses GENTOO_MIRRORS to specify mirrors to use for source retrieval.

GENTOO_MIRRORS="http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ http://ftp.uni-erlangen.de/pub/mirrors/gentoo http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/ http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/"

```

on the bad(=non-working client) it looks like this:

/etc/make.conf of the bad client:

```

GENTOO_MIRRORS="http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ http://ftp.uni-erlangen.de/pub/mirrors/gentoo http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/ http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/"

# Default fetch command (5 tries, passive ftp for firewall compatibility)

http_proxy="http://192.168.4.10:12000"

#FETCHCOMMAND="/usr/bin/wget -t 5  --passive-ftp \${URI} -P \${DISTDIR}"

RESUMECOMMAND=" /usr/bin/wget -t 5 --passive-ftp  \${URI} -O \${DISTDIR}/\${FILE}" 

```

but when i try to emerge something, i get this:

```

bad client ~ # emerge netcat

>>> Downloading http://gentoo.oregonstate.edu/distfiles/nc-v6-20000918.patch.gz

--11:14:37--  http://gentoo.oregonstate.edu/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc-v6-20000918.patch.gz

--11:14:37--  http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc-v6-20000918.patch.gz

--11:14:37--  http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc-v6-20000918.patch.gz

--11:14:37--  http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc-v6-20000918.patch.gz

--11:14:37--  http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading ftp://sith.mimuw.edu.pl/pub/users/baggins/IPv6/nc-v6-20000918.patch.gz

--11:14:37--  ftp://sith.mimuw.edu.pl/pub/users/baggins/IPv6/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

Auflösen des Hostnamen »sith.mimuw.edu.pl«.... 193.0.96.4

Verbindungsaufbau zu sith.mimuw.edu.pl[193.0.96.4]:21... verbunden.

Anmelden als anonymous ... Angemeldet!

==> SYST ... fertig.    ==> PWD ... fertig.

==> TYPE I ... fertig.  ==> CWD /pub/users/baggins/IPv6 ... fertig.

==> PASV ... fertig.    ==> RETR nc-v6-20000918.patch.gz ... fertig.

Länge: 8,740 (unmaßgeblich)

100%[=============================================================================================================================================>] 8,740         --.--K/s             

11:14:38 (99.17 KB/s) - »/usr/portage/distfiles/nc-v6-20000918.patch.gz« gespeichert [8740]

>>> Downloading http://gentoo.oregonstate.edu/distfiles/nc110.tgz

--11:14:38--  http://gentoo.oregonstate.edu/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc110.tgz

--11:14:38--  http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc110.tgz

--11:14:38--  http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc110.tgz

--11:14:38--  http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc110.tgz

--11:14:38--  http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://www.atstake.com/research/tools/network_utilities/nc110.tgz

--11:14:38--  http://www.atstake.com/research/tools/network_utilities/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

!!! Couldn't download nc110.tgz. Aborting.

```

it doesn't seem to recognize the http_proxy at all in the above.

i saw that somebody before me had a similar problem, but that was on the server and not on the client. and unlike him... when i comment out the http_proxy line in /etc/make.conf, portage is able to download files:

```

bad client ~ # emerge netcat

>>> Downloading http://gentoo.oregonstate.edu/distfiles/nc-v6-20000918.patch.gz

--11:21:29--  http://gentoo.oregonstate.edu/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

Auflösen des Hostnamen »gentoo.oregonstate.edu«.... 140.211.166.134

Verbindungsaufbau zu gentoo.oregonstate.edu[140.211.166.134]:80... verbunden.

HTTP Anforderung gesendet, warte auf Antwort... 301 Moved Permanently

Platz: http://gentoo.osuosl.org/distfiles/nc-v6-20000918.patch.gz[folge]

--11:21:30--  http://gentoo.osuosl.org/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

Auflösen des Hostnamen »gentoo.osuosl.org«.... 140.211.166.134

Verbindungsaufbau zu gentoo.osuosl.org[140.211.166.134]:80... verbunden.

HTTP Anforderung gesendet, warte auf Antwort... 200 OK

Länge: 8,740 [application/x-gzip]

100%[=============================================================================================================================================>] 8,740         43.10K/s             

11:21:30 (42.96 KB/s) - »/usr/portage/distfiles/nc-v6-20000918.patch.gz« gespeichert [8740/8740]

>>> Downloading http://gentoo.oregonstate.edu/distfiles/nc110.tgz

--11:21:30--  http://gentoo.oregonstate.edu/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

Auflösen des Hostnamen »gentoo.oregonstate.edu«.... 140.211.166.134

Verbindungsaufbau zu gentoo.oregonstate.edu[140.211.166.134]:80... verbunden.

HTTP Anforderung gesendet, warte auf Antwort... 301 Moved Permanently

Platz: http://gentoo.osuosl.org/distfiles/nc110.tgz[folge]

--11:21:31--  http://gentoo.osuosl.org/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

Auflösen des Hostnamen »gentoo.osuosl.org«.... 140.211.166.134

Verbindungsaufbau zu gentoo.osuosl.org[140.211.166.134]:80... verbunden.

HTTP Anforderung gesendet, warte auf Antwort... 200 OK

Länge: 75,267 [application/x-tar]

100%[=============================================================================================================================================>] 75,267        97.18K/s             

11:21:32 (96.90 KB/s) - »/usr/portage/distfiles/nc110.tgz« gespeichert [75267/75267]

```

here, too, it tries the "official" mirrors first, before it gets to my GENTOO_MIRRORS from /etc/make.conf

my /etc/portage/mirrors looks like this: (on server and both clients)

```

# Http-Replicator override for FTP and RESTRICT="nomirror" packages

local http://gentoo.oregonstate.edu 

```

so, anybody got an idea why this is not working for me on the one client?

is there any setting that overrides my GENTOO_MIRRORS in /etc/make.conf ?

and why doesn't portage recognize the http_proxy line in /etc/make.conf ?

the only difference between the two clients is the architecture and the keywords. the working client is a PPC (stable) and the non-working one is x86 (unstable)... but that shouldn't make a difference, no?

thanks in advance for any tip pointing me into the right direction.  :Wink: 

.bob

----------

## flybynite

 *Bob Shroom wrote:*   

> 
> 
> i think my server is set up correctly, as another client in my LAN does connect to it and uses it just fine.
> 
> 

 

Ok, your server is working....

 *Bob Shroom wrote:*   

> 
> 
> my /etc/portage/mirrors looks like this: (on server and both clients)
> 
> ```
> ...

 

Yes, the local mirror in /etc/portage/mirrors is always checked first.  Change it from gentoo.oregonstate.edu to your preferred german mirror.

 *Bob Shroom wrote:*   

> 
> 
> and why doesn't portage recognize the http_proxy line in /etc/make.conf ?
> 
> 

 

Let's check a couple of things.  

1.  In your server config you only allow specific IPs - are you sure the IP of the bad box didn't change through dhcp and is now being denied?  

You may be using static IP's, I don't know....  But, unless you have a good reason - leave the IP set to the range of your lan probably '192.168.4.*'  Security is good but can be frustrating when you have a mistake in your config....

If your bad client is being denied, or not even reaching your server due to routing issues, it will show in the /var/log/http-replicator log on the server.  Try posting a couple of log entries showing the good and the bad box trying to connect.

2.  You can test that your http_proxy is set several ways.   Try these tests on the bad client...

```

source /etc/make.conf

echo $http_proxy

```

Does this print "http://192.168.4.10:12000" ?

```

http_proxy=http://192.168.4.10:12000 wget http://gentoo.oregonstate.edu/distfiles/nc110.tgz

```

Does wget contact http-replicator?

How about this one:

```

http_proxy=http://192.168.4.10:12000 emerge netcat

```

Do you still get the same errors?

If this solves the problem, the http_proxy isn't set in the bad client's make.conf - probably an extra " or ' messing up the settings.  The extra " or ' could be several lines above the line with your http_proxy setting...
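One quick, crude way to spot that kind of quoting mistake is to count the double quotes in make.conf: an odd total usually means one is unmatched somewhere (this is only a heuristic sketch; it can't tell you which line, and single quotes would need the same check):

```shell
# Crude heuristic: an odd number of " characters in make.conf usually
# means one quote is unmatched somewhere in the file.
quotes=$(tr -cd '"' < /etc/make.conf | wc -c)
if [ $((quotes % 2)) -ne 0 ]; then
    echo "make.conf has an odd number of double quotes - look for an unmatched one"
fi
```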

I know we can get http-replicator working for you!!!

----------

## Bob Shroom

hi, thanx for your reply

let's see what we got here:

 *Quote:*   

> Yes, the local mirror in /etc/portage/mirrors is always checked first. Change it from gentoo.oregonstate.edu to your preferred german mirror. 

 

changed it to my fav german mirror

```

# Http-Replicator override for FTP and RESTRICT="nomirror" packages

local http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/

```

 *Quote:*   

> 1. In your server config you only allow specific IP's - are you sure the IP of the bad box didn't change through dhcpd and is now being denied? 

 

i dont use DHCP ... only static IPs

```

bad client ~ # ifconfig eth0

eth0      Protokoll:Ethernet  Hardware Adresse 00:50:04:45:BE:19  

          inet Adresse:192.168.4.20  Bcast:192.168.4.255  Maske:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          Kollisionen:0 Sendewarteschlangenlänge:1000 

          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

          Interrupt:11 Basisadresse:0xb400 

```

 *Quote:*   

> You may be using static IP's, I don't know.... But, unless you have a good reason - leave the IP set to the range of your lan probably '192.168.4.*' Security is good but can be frustrating when you have a mistake in your config.... 

 

i understand... even though i am sure this is not the problem, i modified the server config:

```

server root # cat /etc/http-replicator.conf | grep 192

IP = ['127.0.0.1','192.168.4.*']

```

 *Quote:*   

> If your bad client is being denied or not even reaching your server due to routing issues it will show in /var/log/http-replicator log on the server. Try posting a couple of log entries showing the good and the bad box trying to connect.

 

well... the bad client definitely doesn't reach the proxy. this shouldn't be a routing issue though. when i try to emerge something, it goes straight for the server in /etc/portage/mirrors and then the GENTOO_MIRRORS in /etc/make.conf. it doesn't even try to connect to the proxy. i can see that in the emerge output and on my switch LEDs.

i was tailing /var/log/http-replicator while i emerged something from the working client. unfortunately nothing was logged.

i was also grepping through /var/log/http-replicator.old and found that only the server itself shows up in the logfile when i emerge packages. (server=192.168.4.10 uses the proxy AND shows in the logfile; good client=192.168.4.30 uses the proxy but doesn't show up in the logfile; bad client=192.168.4.20 doesn't use the proxy and of course doesn't show up in the logfile)

 *Quote:*   

> 2. You can test that your http_proxy is set several ways. Try these tests on the bad client...
> 
> ```
> 
> source /etc/make.conf
> ...

 

yes it does.

```

bad client ~ # source /etc/make.conf

bad client ~ # echo $http_proxy

http://192.168.4.10:12000

```

 *Quote:*   

> 
> 
> ```
> 
> http_proxy=http://192.168.4.10:12000 wget http://gentoo.oregonstate.edu/distfiles/nc110.tgz
> ...

 

no... it aborts ("Abgebrochen" = aborted).

this is all i get:

```

bad client ~ # http_proxy=http://192.168.4.10:12000 wget http://gentoo.oregonstate.edu/distfiles/nc110.tgz 

--10:42:52--  http://gentoo.oregonstate.edu/distfiles/nc110.tgz

           => `nc110.tgz'

Abgebrochen

```

 *Quote:*   

> How about this one:
> 
> ```
> 
> http_proxy=http://192.168.4.10:12000 emerge netcat
> ...

 

yes... unfortunately that didn't do the trick:  :Sad: 

```

bad client ~ # http_proxy=http://192.168.4.10:12000 emerge netcat 

Calculating dependencies ...done!

>>> emerge (1 of 1) net-analyzer/netcat-110-r6 to /

>>> Downloading http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror//distfiles/nc110.tgz

--10:52:10--  http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc110.tgz

--10:52:10--  http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc110.tgz

--10:52:10--  http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc110.tgz

--10:52:10--  http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc110.tgz

--10:52:10--  http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://www.atstake.com/research/tools/network_utilities/nc110.tgz

--10:52:10--  http://www.atstake.com/research/tools/network_utilities/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

!!! Couldn't download nc110.tgz. Aborting.

```

i suppose my portage can't be messed up, because when i comment out the http_proxy line in /etc/make.conf it downloads the file and merges it.

 *Quote:*   

> I know we can get http-replicator working for you!!!

 

yeah, that would be sweet.  :Smile: 

i followed your rsync howto and it worked like a charm. replicator worked like a charm too until i started setting up the bad client.

greets,

.bob

----------

## flybynite

 *Bob Shroom wrote:*   

> 
> 
> well... the bad client definitely doesn't reach the proxy. this shouldn't be a routing issue though. ...
> 
>  ...it doesn't even try to connect to the proxy. i see that in the emerge output and on my switch-LEDs.
> ...

 

I believe this is telling you that it is a routing problem!

1.  We know http-replicator works because another box connects and gets excellent download speeds.

2.  We know that wget itself can't connect to http-replicator from the bad box.  The wget test we tried bypasses any config errors or portage problems and should have worked independent of any other programs/settings/configs, etc.

3.  We know wget itself works because it can connect to a working server on the internet and complete a download.  It just can't connect to a working server on your lan.

This only leaves routing and firewall issues!!!

The first issue: replicator itself has a security feature that limits the IPs that can connect to it.  You changed the list to include your whole LAN.  I didn't specifically mention that you should restart http-replicator after you changed the config.  We both don't think that was the problem, but if you didn't do the restart, do it and re-check the previous tests just to make sure....

I've looked again at your errors and they stop at the same place - name/IP resolution...

Here is what I get when I try the same test you did:

```

 $ http_proxy=http://192.168.4.10:12000 wget http://gentoo.oregonstate.edu/distfiles/nc110.tgz

--12:33:48--  http://gentoo.oregonstate.edu/distfiles/nc110.tgz

           => `nc110.tgz'

Connecting to 192.168.4.10:12000...

```

Compare this to your error:

```

bad client ~ # http_proxy=http://192.168.4.10:12000 wget http://gentoo.oregonstate.edu/distfiles/nc110.tgz

--10:42:52--  http://gentoo.oregonstate.edu/distfiles/nc110.tgz

           => `nc110.tgz'

Abgebrochen

```

Your proxy IP isn't even on my network and won't work, of course.  But notice where wget stopped on my box compared with your box.

On my box, wget parsed the request, did the name/IP lookup, and then tried to connect to the destination before quitting.  Your box seems to have failed at the name/IP lookup and never tried to connect to the destination.  That's why you never saw the lights on your router: wget never attempted the connection.

The question is: why did wget fail at that step?  Since we used IPs, there is no name resolution going on.  It just needs to look up the destination in the routing table.
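That routing-table lookup can be checked directly; assuming the iproute2 `ip` tool is available, the kernel will report which route and interface it would pick, without sending any packets:

```shell

# Ask the routing table which route would be used to reach the proxy;
# no traffic is generated by this query.
ip route get 192.168.4.10

```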

Let's check the network.

What do you get when you try to ping the proxy?

```

ping -c 3 192.168.4.10

```

Show the route of the good box and the bad box - you must be root to do this...

```

route

```

I assume you have a normal LAN?  All the boxes are plugged into a plain router?

----------

## Bob Shroom

 *Quote:*   

> I didn't specifically mention that you should restart http-replicator after you changed the config. We both don't think that was the problem, but if you didn't do the restart, do it and check the previous tests just to make sure.... 

 

ok, did that...just to make sure...

```
server root # /etc/init.d/http-replicator stop  

 * Stopping Http-Replicator...                                            [ ok ]

server root # /etc/init.d/http-replicator start

 * Starting Http-Replicator...                                            [ ok ]

server root # 

```

...but that didn't do the trick.

```

bad client ~ # emerge netcat

Calculating dependencies ...done!

>>> emerge (1 of 1) net-analyzer/netcat-110-r6 to /

>>> Downloading http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror//distfiles/nc-v6-20000918.patch.gz

--17:30:18--  http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc-v6-20000918.patch.gz

--17:30:18--  http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc-v6-20000918.patch.gz

--17:30:18--  http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc-v6-20000918.patch.gz

--17:30:18--  http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc-v6-20000918.patch.gz

--17:30:18--  http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

>>> Downloading ftp://sith.mimuw.edu.pl/pub/users/baggins/IPv6/nc-v6-20000918.patch.gz

--17:30:18--  ftp://sith.mimuw.edu.pl/pub/users/baggins/IPv6/nc-v6-20000918.patch.gz

           => `/usr/portage/distfiles/nc-v6-20000918.patch.gz'

Auflösen des Hostnamen »sith.mimuw.edu.pl«.... 193.0.96.4

Verbindungsaufbau zu sith.mimuw.edu.pl[193.0.96.4]:21... verbunden.

Anmelden als anonymous ... Angemeldet!

==> SYST ... fertig.    ==> PWD ... fertig.

==> TYPE I ... fertig.  ==> CWD /pub/users/baggins/IPv6 ... fertig.

==> PASV ... fertig.    ==> RETR nc-v6-20000918.patch.gz ... fertig.

Länge: 8,740 (unmaßgeblich)

100%[====================================>] 8,740         --.--K/s             

17:30:21 (80.55 KB/s) - »/usr/portage/distfiles/nc-v6-20000918.patch.gz« gespeichert [8740]

>>> Downloading http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror//distfiles/nc110.tgz

--17:30:21--  http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc110.tgz

--17:30:21--  http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc110.tgz

--17:30:21--  http://ftp.uni-erlangen.de/pub/mirrors/gentoo/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc110.tgz

--17:30:21--  http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc110.tgz

--17:30:21--  http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/distfiles/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

>>> Downloading http://www.atstake.com/research/tools/network_utilities/nc110.tgz

--17:30:21--  http://www.atstake.com/research/tools/network_utilities/nc110.tgz

           => `/usr/portage/distfiles/nc110.tgz'

!!! Couldn't download nc110.tgz. Aborting.

bad client ~ # 

```

 *Quote:*   

> I've looked again at your error's and they stop at the same place - name/IP resolution... 

 

yes, i've noticed that too. on my good box the output matches yours...connecting to the proxy and all that...but the bad box doesn't give a damn about the proxy for some strange reason...   :Confused: 

 *Quote:*   

> What do you get when you try to ping the proxy?
> 
> ```
> ping -c 3 192.168.4.10
> ```
> ...

 

looks all normal to me...

```
bad client ~ # ping -c 3 192.168.4.10

PING 192.168.4.10 (192.168.4.10) 56(84) bytes of data.

64 bytes from 192.168.4.10: icmp_seq=1 ttl=64 time=0.199 ms

64 bytes from 192.168.4.10: icmp_seq=2 ttl=64 time=0.182 ms

64 bytes from 192.168.4.10: icmp_seq=3 ttl=64 time=0.188 ms

--- 192.168.4.10 ping statistics ---

3 packets transmitted, 3 received, 0% packet loss, time 1999ms

rtt min/avg/max/mdev = 0.182/0.189/0.199/0.017 ms

```

 *Quote:*   

> Show the route of the good box and the bad box - you must be root to do this...
> 
> ```
> route
> ```
> ...

 

the good box:

```
good client root # route

Kernel IP Routentabelle

Ziel            Router          Genmask         Flags Metric Ref    Use Iface

192.168.4.0     *               255.255.255.0   U     0      0        0 eth0

loopback        localhost       255.0.0.0       UG    0      0        0 lo

default         doobcop         0.0.0.0         UG    0      0        0 eth0

```

the bad box:

```
bad client ~ # route

Kernel IP Routentabelle

Ziel            Router          Genmask         Flags Metric Ref    Use Iface

192.168.4.0     *               255.255.255.0   U     0      0        0 eth0

loopback        localhost       255.0.0.0       UG    0      0        0 lo

default         doobcop         0.0.0.0         UG    0      0        0 eth0

```

they both look the same. one thing confusing me a little bit is that IP '192.168.4.0'....i ain't got a device on my network with that IP!? but if that were my problem, it wouldn't work on both boxen, no?

 *Quote:*   

> I assume you have a normal lan? All the boxes are plugged into a plain router?

 

yes, all boxen are connected to the same switch (unmanaged... it has run for several years without a problem... i didn't unplug or even touch the cables lately)

no firewall or similar between them two boxen.

----------

## flybynite

Haven't found it yet, but let's move this off list.  PM me the following.

on the server and the bad box:

```

iptables -L

```

and on the bad box:

```

cat ~/.wgetrc

cat .wgetrc

cat /etc/wget/wgetrc

```

and check the router config.  Just to make sure you don't have any firewall/forwarding setup in there.
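A single grep over the usual wgetrc locations also works (paths assumed from the post above; missing files are simply skipped):

```shell

# Look for stray proxy settings in any wgetrc wget might read;
# the fallback message confirms a clean result.
grep -i proxy ~/.wgetrc /etc/wget/wgetrc /etc/wgetrc 2>/dev/null \
  || echo "no proxy settings found"

```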

----------

## Bob Shroom

 *Quote:*   

> PM me the following.

 

alright, check your inbox.

----------

## cdunham

I have been seeing this, both with 1.3 and 1.5. I'm guessing I have a borked file or something:

```
# repcacheman

Replicator's cache directory: /var/cache/http-replicator/

Portage's DISTDIR: /usr/portage/distfiles/

Comparing directories....

Done!

Deleting duplicate file(s) in /usr/portage/distfiles/

http-replicator-2.1_rc3.tar.gz

Done!

New files in DISTDIR:

libsoup-2.2.1.tar.bz2

.locks

evolution-data-server-1.0.2.tar.bz2

gal-2.2.3.tar.bz2

evolution-2.0.2.tar.bz2

Checking authenticity and integrity of new files...

Searching for ebuilds's ....

Done!

Found 16331 ebuilds.

Extracting the checksums....

Missing digest: dev-perl/Net-SNMP-4.1.2

Done!

Verifying checksum's....

/usr/portage/distfiles/libsoup-2.2.1.tar.bz2

Traceback (most recent call last):

  File "/usr/bin/repcacheman", line 204, in ?

    if t[0]:

KeyError: 0
```

Anyone else seeing this? A known problem? Everything seems to be working great, but this error keeps happening...

Thanks!

----------

## flybynite

 *cdunham wrote:*   

> I have been seeing this, both with 1.3 and 1.5. I'm guessing I have a borked file or something:
> 
> New files in DISTDIR:
> 
> libsoup-2.2.1.tar.bz2
> ...

 

This is a quick guess because of time, but I don't know of any reason why (.) hidden files should be in the distfiles dir.  

Delete the dot file and try again.

```

rm /usr/portage/distfiles/.locks

```

Have you been sharing the distfiles dir or running some other program that would leave a ".locks" file there?  That certainly isn't a package portage would download!

If this is the fix I'll add some code in the next version of repcacheman to allow for this, but I don't know how that file got there.....

----------

## cdunham

Actually, it's a directory, and seems to be on every Gentoo machine I looked at.

In any case, removing it didn't help. I'm not big on python. What is t[0] supposed to have at that point, and what does the error message actually mean?

Thanks for taking a look!

----------

## flybynite

 *cdunham wrote:*   

> Actually, it's a directory, and seems to be on every Gentoo machine I looked at.
> 
> In any case, removing it didn't help. I'm not big on python. What is t[0] supposed to have at that point, and what does the error message actually mean?
> 
> Thanks for taking a look!

 

First, the .locks dir isn't on any gentoo machine I have.  I still suspect that you currently share, or have in the past shared, the distfiles dir using samba or nfs.  I'm curious because it may be something I need to account for in later versions of repcacheman. 

But it doesn't matter in this case; the problem seems to be the previous file.  Repcacheman skips dirs.  I couldn't tell .locks was a dir from your post  :Smile: 

I believe the problem is in portage.  Could be their error or mine, I don't know yet.  Sometimes devs forget something in the portage tree and later correct it.  An example is that someone probably forgot to check in the missing Net-SNMP MD5 shown in your repcacheman output:

```

Extracting the checksums....

Missing digest: dev-perl/Net-SNMP-4.1.2

Done! 

```

When I run repcacheman I don't see that missing md5 so someone added it after you last sync'd!

Repcacheman uses portage functions to get the md5 of the new files.  t[0] should be that md5sum. 

Please show me the following:

```

cat  /usr/portage/net-libs/libsoup/files/digest-libsoup-2.2.1

ls -l /usr/portage/distfiles/libsoup-2.2.1.tar.bz2

md5sum /usr/portage/distfiles/libsoup-2.2.1.tar.bz2

```

Then :

```

emerge sync

cat  /usr/portage/net-libs/libsoup/files/digest-libsoup-2.2.1

```

The idea here is to check the md5 listed in portage and the actual md5 of the problem file, sync and see if it changes.

Then run repcacheman again to see if you still get the same error.

If you do, and the md5's do match, then:

```

mv /usr/portage/distfiles/libsoup-2.2.1.tar.bz2  /var/cache/http-replicator/ 

```

This will fail if /usr and /var are on different partitions, but I wanted to know if they were  :Smile:   Just copy the file to replicator's cache, delete the distfiles copy, and try repcacheman again.

----------

## cdunham

 *flybynite wrote:*   

> Please show me the following:
> 
> ```
> 
> cat  /usr/portage/net-libs/libsoup/files/digest-libsoup-2.2.1
> ...

 

```
% cat  /usr/portage/net-libs/libsoup/files/digest-libsoup-2.2.1

MD5 8132b0bce469affed688c4863702aa41 libsoup-2.2.1.tar.bz2 403907

% ls -l /usr/portage/distfiles/libsoup-2.2.1.tar.bz2

-rw-rw-r--  1 root portage 403907 Oct 16 17:15 /usr/portage/distfiles/libsoup-2.2.1.tar.bz2

% md5sum /usr/portage/distfiles/libsoup-2.2.1.tar.bz2

8132b0bce469affed688c4863702aa41  /usr/portage/distfiles/libsoup-2.2.1.tar.bz2
```

 *Quote:*   

> 
> 
> Then :
> 
> ```
> ...

 

```
% cat  /usr/portage/net-libs/libsoup/files/digest-libsoup-2.2.1

MD5 8132b0bce469affed688c4863702aa41 libsoup-2.2.1.tar.bz2 403907

```

 *Quote:*   

> Then run repcacheman again to see if you still get the same error.

 

Yup.

 *Quote:*   

> If you do, and the md5's do match, then:
> 
> ```
> 
> mv /usr/portage/distfiles/libsoup-2.2.1.tar.bz2  /var/cache/http-replicator/ 
> ...

 

They are on the same partition, so mv worked, but now it seems to have moved on:

```
# repcacheman

Replicator's cache directory: /var/cache/http-replicator/

Portage's DISTDIR: /usr/portage/distfiles/

Comparing directories....

Done!

New files in DISTDIR:

.locks

evolution-data-server-1.0.2.tar.bz2

evolution-2.0.2.tar.bz2

Checking authenticity and integrity of new files...

Searching for ebuilds's ....

Done!

Found 16359 ebuilds.

Extracting the checksums....

Missing digest: dev-perl/Net-SNMP-4.1.2

Done!

Verifying checksum's....

WARNING .locks is not in portage!!!

/usr/portage/distfiles/evolution-data-server-1.0.2.tar.bz2

Traceback (most recent call last):

  File "/usr/bin/repcacheman", line 204, in ?

    if t[0]:

KeyError: 0

```

Note that .locks has returned. This machine has never run samba or nfs. It seems to be coming from emerge --sync, but I'll work more on verifying that.

Hope this helps!

----------

## cdunham

Just a little more information. I tracked down the problem with:

```
Extracting the checksums....

Missing digest: dev-perl/Net-SNMP-4.1.2

Done! 
```

which was a redundant (and un-Manifested) ebuild in PORTAGE_OVERLAY.  Removing it cleared up this message, but not the ultimate problem.

However, changing t[0] to t['MD5'] did seem to fix things...

----------

## Bob Shroom

@flybynite:

just wanted to let you know, that re-emerging wget did the trick for me.

now the bad box connects to the proxy as it should.

thanks for your help and time... and keep up the good work.  :Smile: 

.bob

----------

## flybynite

cdunham,

Ok, I guess the mirror you sync to must be using some program that leaves the .locks dir.  Do you sync to a specific mirror or rotation?  I would like to sync there myself to see what else might be in there...

It seems there is nothing special about your libsoup file error.  It's just failing on the first file.....

 *cdunham wrote:*   

> Just a little more information....
> 
> changing t[0] to t['MD5'] did seem to fix things...

 

Ah yes, an always welcome API change in unstable portage  :Smile:   Shame on me for not asking which portage version you're using; you obviously run ~

Relevant part of the diff between stable 2.0.50-r11 and unstable portage-2.0.51-rc9:

```

< if (md5_list[mybn][0] != md5sums[mybn][0])

---

> if (md5_list[mybn]["MD5"]  != md5sums[mybn]["MD5"])

```

So,  please edit repcacheman and change t[0] to t["MD5"] to confirm that this works.
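If you'd rather not edit by hand, a sed one-liner makes the same change; this is just a sketch (the paths are the ones from this thread), so preview it on a sample line before touching the real script and keep a backup:

```shell

# Preview the substitution on a sample line first:
echo 'if t[0]:' | sed 's/t\[0\]/t["MD5"]/g'
# -> if t["MD5"]:

# Then apply it in place (back the script up beforehand):
# cp /usr/bin/repcacheman /usr/bin/repcacheman.bak
# sed -i 's/t\[0\]/t["MD5"]/g' /usr/bin/repcacheman

```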

If so I'll update http-replicator's ebuild to include an unstable version for unstable portage.

cdunham, thanks for the help fixing this!!

*Last edited by flybynite on Fri Oct 29, 2004 7:56 pm; edited 1 time in total*

----------

## cdunham

Hey, that seems to have done it.

Thanks, flybynite!

----------

## VinnieNZ

I've just upgraded to the latest version of portage (2.0.51-r2) and when running repcacheman I get the following error:

 *Quote:*   

> *SNIP*
> 
> Extracting the checksums....
> 
> Done!
> ...

 

Are there any fixes for this?

*EDIT  Just read the post 2 above this one and realised that this looks like the likely solution.  Problem is that I don't have any lines that look like the first set in my /usr/bin/repcacheman file.

*Last edited by VinnieNZ on Wed Apr 25, 2007 11:31 pm; edited 1 time in total*

----------

## JohnHerdy

 *flybynite wrote:*   

> If so I'll update http-replicator's ebuild to include an unstable version for unstable portage.

 

Hi flybynite,

We are using http-replicator on our network and it just works great. I want to thank you for the time and effort you put in this great program. Maybe the readers on this thread need to give more feedback on the bug-report so that this great program is added to the tree.

Portage 2.0.51 is marked stable so the current version doesn't work anymore. Fortunately you already have found a solution to this problem. Would you please let us know when you have a new ebuild ready.

<shameless plug> If it possible I would like to make a feature request; provide an entry in the log to see which IP-adresses have used the http-replicator for what file</shameless plug>

----------

## kevquinn

Don't know if it's just me, but when I upgraded to portage 2.0.51 (since it's now gone stable), my cache directory was automatically deleted, presumably by portage  :Sad: 

I'd placed it at /usr/portage/distcache.  My /var partition is 8Gb in size, and the replicator cache got rather large, enough to cause larger ebuilds to bomb due to lack of space in /var/tmp/portage.  I've now changed it to /usr/cache/distfiles.  Of course, now I've lost all my downloads - I knew I should have backed it up to a DVD-R!

So the warning is, don't put your cache directory in /usr/portage!  I guess it was a stupid thing to do anyway, in hindsight...

----------

## Bob Paddock

I'm having trouble getting http-replicator to work consistently.  I can fetch from my laptop to get files from the desktop machine if the package is in the cache.  However, if the package is not in the cache then the desktop machine does not go out on the Internet and get it, and the laptop just says "waiting..." forever.

I suspect I have a mixture of versions for example, from page four of this thread:

"Edit /etc/conf.d/http-replicator to add your external proxy and check other defaults."

but I find it as /etc/http-replicator.

Are there *CURRENT* install instructions for the *LATEST* version?  I've tried to piece that together from the long thread here, without success.

Is there an ebuild that works for the *LATEST* version?

----------

## flybynite

 *Bob Paddock wrote:*   

> 
> 
> Are there *CURRENT* install instructions for the *LATEST* version? 

 

How are you defining current?  I defined STABLE and UNSTABLE.  Each version has its own instructions.

The howto at the start of this thread is the "stable" version and it ONLY has a download link for http://www.updatedlinux.com/replicator/http-replicator-flybynite-1.3.tar.bz2

and complete instructions.

The "unstable" version post on page 4 ( https://forums.gentoo.org/viewtopic.php?t=173226&start=76 ) is the only place you can find a download link for http://www.updatedlinux.com/replicator/http-replicator-flybynite-1.5.tar.bz2 and says this which I thought would be clear:

 *Quote:*   

> 
> 
> New version ready for testing!! Although the init code,logging, and options are cleaned up, this version primarily adds external proxy support.
> 
> Current users not needing external proxy support should wait for the next stable release. Experienced users can help test.
> ...

 

If that doesn't make sense to you, then run:

```

etcat -v http-replicator

```

Which will show you your version.  

```

*  net-misc/http-replicator :

        [   ] 2.0-r2 (0) OVERLAY

        [  I] 2.1_rc3 (0) OVERLAY

```

If you have 2.0-r2 then read only the HOWTO

If you have 2.1_rc3 then read the HOWTO PLUS the changes in the "unstable version" post at https://forums.gentoo.org/viewtopic.php?t=173226&start=76

----------

## flybynite

 *kevquinn wrote:*   

> Don't know if it's just me, but when I upgraded to portage 2.0.51 (since it's now gone stable), my cache directory was automatically deleted, presumably by portage 
> 
> 

 

I confirmed this.  Portage actually deleted /usr/portage/distcache when you ran emerge sync.

It makes sense, really.   "emerge sync" makes /usr/portage on your box the same as the /usr/portage on the server.  Since /usr/portage/distcache doesn't exist on the server, rsync deleted it the same way it would delete an outdated package.

Just to make this clear to all users: this had nothing to do with http-replicator, it was just a normal function of an emerge sync.  The lesson is that portage owns /usr/portage, so don't put anything else there.
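The behavior is easy to reproduce in a scratch directory; this sketch (assuming rsync is installed) shows --delete removing a directory that exists only on the destination, just like a cache dir living under /usr/portage:

```shell

# Anything in the destination that is missing from the source
# gets deleted by rsync --delete, exactly like an emerge sync.
d=$(mktemp -d)
mkdir -p "$d/server/tree" "$d/client/tree" "$d/client/distcache"
rsync -a --delete "$d/server/" "$d/client/"
ls "$d/client"     # only 'tree' remains; 'distcache' was deleted
rm -rf "$d"

```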

----------

## acaito_con

 *cdunham wrote:*   

> 
> 
> However, changing t[0] to t['MD5'] did seem to fix things...

 

I did have to change this reference on both lines 198 and 201; this may help others as well.

Glad i came across this post. i'd already resync'd, updated python, and viewed the source for obvious errors; i wasn't familiar with the MD5 lib.

glad this works, great tool

btw, you may want to stop by the gentoo-wiki; it might need some slight updates.

----------

## flybynite

 *JohnHerdy wrote:*   

> 
> 
> <shameless plug> If it possible I would like to make a feature request; provide an entry in the log to see which IP-adresses have used the http-replicator for what file</shameless plug>

 

Already included in 2.1_rc3, posted Wed Aug 04 2004, at https://forums.gentoo.org/viewtopic.php?t=173226&start=76 !

It's part of the stats package that is coming.  You may need to parse a couple lines to get the complete info you want.  But you probably just need to wait for that stats package in the next version of http-replicator.

Here is just part of what is logged in that version:

```

24 Oct 2004 19:09:23 DEBUG: HttpClient 161 connected to 192.168.2.123:34224

24 Oct 2004 19:09:23 INFO: HttpClient 161 received request for http://gentoo.osuosl.org/distfiles/prelink-20040707.tar.bz2

24 Oct 2004 19:09:23 STAT: HttpServer 161 serving 902324 bytes from cache to 192.168.2.123

```
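Until that stats package lands, the STAT lines above can already be totaled per client with a little awk; the field positions are assumed from the sample log, so adjust if your format differs:

```shell

# Sum the "serving N bytes ... to IP" STAT lines per client IP.
# The last field is the client address; the number after "serving"
# is the byte count.
awk '$5 == "STAT:" {
       for (i = 1; i < NF; i++)
         if ($i == "serving") sum[$NF] += $(i + 1)
     }
     END { for (ip in sum) print ip, sum[ip] }' /var/log/http-replicator

```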

----------

## JohnHerdy

 *flybynite wrote:*   

>  *JohnHerdy wrote:*   
> 
> <shameless plug> If it possible I would like to make a feature request; provide an entry in the log to see which IP-adresses have used the http-replicator for what file</shameless plug> 
> 
> Already included in 2.1_rc3

 

Man I love you!!! (or at least your work). Sorry for asking a stupid question, but I couldn't find it on your website; is the latest version available on your website compliant with 2.0.51-clients?

----------

## kmarasco

I added 

```
RESUMECOMMAND="/usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}" 
```

to my make.conf, and http-replicator worked fine. However, I am serving binary packages from Apache on the same server, and my emerge -gK fails with that setting.

```
kristel root # emerge -gK beaver

Fetching binary packages info...

Loaded metadata pickle.

cache miss: 'x' --- cache hit: 'o'

xoo

  -- DONE!

Calculating dependencies ...done!

>>> emerge (1 of 1) app-editors/beaver-0.2.6 to /

Fetching 'app-editors/beaver-0.2.6'

--01:07:23--  http://192.168.2.2/portage/packages/All/beaver-0.2.6.tbz2

           => `/usr/portage/packages/All//${FILE}'

Connecting to 192.168.2.2:80... connected.

HTTP request sent, awaiting response... 200 OK

Length: 144,983 [text/plain]

100%[=================================================================================>] 144,983       --.--K/s

01:07:23 (1.02 MB/s) - `/usr/portage/packages/All//${FILE}' saved [144983/144983]

!!! CATEGORY info missing from info chunk, aborting...

kristel root #
```

Note that it actually uses the variable name as the name of the file when trying to save the binary to the package directory.

If I use this syntax for the resume command:

```
RESUMECOMMAND="/usr/bin/wget -t 5 --passive-ftp \${URI} -P \${DISTDIR}" 
```

Both http-replicator and my binary pull seem to work fine, but I am concerned that I may be creating a problem for http-replicator that I am unaware of, and that may bite me later.

Is the second syntax ok with http-replicator? I'm using portage portage-2.0.51-r2.

----------

## flybynite

 *JohnHerdy wrote:*   

> 
> 
> Sorry for asking a stupid question, but I couldn't find it on your website; is the latest version available on your website compliant with 2.0.51-clients?

 

Sorry, the website is out of date.  There are two versions out; both are listed or linked only in the howto at the start of this thread. 

All versions of http-replicator will work with the latest portage.  But, repcacheman will barf with an error on >=portage-2.0.51.  repcacheman errors will not affect the operation of http-replicator.

You can download a new repcacheman here:

http://www.updatedlinux.com/replicator/portagefix/repcacheman

You'll need to copy it to /usr/bin/repcacheman and chmod +x

Or, just edit 2 lines on your existing /usr/bin/repcacheman.  Change  t[0] to t["MD5"] 

I'll have the fix in the ebuild soon.....

*Last edited by flybynite on Fri Oct 29, 2004 7:53 pm; edited 1 time in total*

----------

## JohnHerdy

 *flybynite wrote:*   

> All versions of http-replicator will work with the latest portage

 

Hi flybynite,

Unfortunately this is not the case: 100% of the 2.0.51 clients don't work, and 100% of the 2.0.50 clients work. The moment I migrate a client to 2.0.51, http-replicator doesn't work anymore. In our configuration we use a proxy server (with validation), and it seems that something is going wrong with the connection to it. Our proxy resolves the IP address for the mirrors, but when I use http-replicator on a 2.0.51 client the proxy always returns "unknown host". When I do an emerge-webrsync on a 2.0.51 client, the tarball is fetched from the http-replicator server.

Thanks,

John.

----------

## flybynite

 *kmarasco wrote:*   

> 
> 
> Is the second syntax ok with http-replicator? I'm using portage portage-2.0.51-r2.

 

It's going to be a long day  :Sad: 

This should fix it...

```

rm /usr/portage/packages/app-editors/beaver-0.2.6.tbz2

rm /usr/portage/packages/All/beaver-0.2.6.tbz2

```

Then try the replicator syntax again.  This will help me figure out where the problem is.

Http-replicator doesn't support resuming yet.  This syntax disables resuming and forces wget to overwrite the partial file.

----------

## flybynite

 *JohnHerdy wrote:*   

> 
> 
> Unfortunately this is not the case. 100% of the 2.0.51-clients don't work

 

2.0.51 works for me  :Smile: 

I need some output and logs before I can even guess at whats wrong....

----------

## JohnHerdy

Ok, did some more research: it seems that the new version of portage doesn't respect my .wgetrc settings anymore. After adding the following lines to /etc/make.conf, 2.0.51 clients work as well:

```

http_proxy="http://YourMirrorHere.com:8080" 

RESUMECOMMAND=" /usr/bin/wget -t 5 --passive-ftp  \${URI} -O \${DISTDIR}/\${FILE}"

```

With 2.0.50 this wasn't necessary; the proxy settings were read from .wgetrc. Hmm, strange...

----------

## kmarasco

 *flybynite wrote:*   

> 
> 
> This should fix it...
> 
> ```
> ...

 

Below is the output after deleting the files from both the client and the server.

```
kristel root # emerge -gK beaver

Fetching binary packages info...

Loaded metadata pickle.

cache miss: 'x' --- cache hit: 'o'

ooo

  -- DONE!

Calculating dependencies ...done!

>>> emerge (1 of 1) app-editors/beaver-0.2.6 to /

Fetching 'app-editors/beaver-0.2.6'

--12:29:48--  http://192.168.2.2/portage/packages/All/beaver-0.2.6.tbz2

           => `/usr/portage/packages/All//${FILE}'

Connecting to 192.168.2.2:80... connected.

HTTP request sent, awaiting response... 404 Not Found

12:29:48 ERROR 404: Not Found.

Fetcher exited with a failure condition.

!!! CATEGORY info missing from info chunk, aborting...

kristel root #

```

If I recreate the binary package, I get back to the originally posted error.

Note: I had ethereal observe the traffic, and the http-replicator proxy does not appear to get involved with the locally served binaries from apache. I'm a rookie at ethereal, so I may have missed something, but it appears that when merging binaries using the PORTAGE_BINHOST variable in make.conf, the http_proxy variable is ignored (only for emerge -gK; the proxy is definitely used when source packages are pulled).

----------

## flybynite

 *kmarasco wrote:*   

> 
> 
> If I recreate the binary package, I get back to the originally posted error.
> 
> 

 

Thanks, that confirms some things for me.

You don't have FETCHCOMMAND set do you?

 *kmarasco wrote:*   

> 
> 
> Note: I had ethereal observe the traffic and the http-replicator proxy does not appear to get involved with the locally served binaries from apache. 
> 
> 

 

Http-replicator isn't involved at all in this transaction.  However RESUMECOMMAND seems to be, and that is the problem.  I find it "interesting" that portage uses RESUMECOMMAND when there is nothing to resume; I've filed a portage bug to get more info.  

I use PORTAGE_BINHOST also, but with an ftp mirror.   If you need a workaround, use an FTP server as your BINHOST till I get this worked out.

----------

## kmarasco

 *flybynite wrote:*   

> 
> 
> I use PORTAGE_BINHOST also, but with an ftp mirror.   If you need a workaround, use an FTP server as your BINHOST till I get this worked out.

 

I set up vsftp and I have the same issue.

```

kristel root # emerge -gK beaver

Fetching binary packages info...

 * No password provided for username 'anonymous'

Loaded metadata pickle.

cache miss: 'x' --- cache hit: 'o'

oooo

  -- DONE!

Calculating dependencies ...done!

>>> emerge (1 of 1) app-editors/beaver-0.2.6 to /

Fetching 'app-editors/beaver-0.2.6'

--23:29:42--  ftp://192.168.2.2/All/beaver-0.2.6.tbz2

           => `/usr/portage/packages/All//${FILE}'

Connecting to 192.168.2.2:21... connected.

Logging in as anonymous ... Logged in!

==> SYST ... done.    ==> PWD ... done.

==> TYPE I ... done.  ==> CWD /All ... done.

==> PASV ... done.    ==> RETR beaver-0.2.6.tbz2 ... done.

Length: 144,983 (unauthoritative)

100%[=================================================================================>] 144,983      107.91K/s

23:29:43 (107.85 KB/s) - `/usr/portage/packages/All//${FILE}' saved [144983]

!!! CATEGORY info missing from info chunk, aborting...

kristel root # 
```

Hmmm, any more thoughts? Typo/poor attention to detail on my part?

```
RESUMECOMMAND="/usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}"
```


----------

## kmarasco

Thanks for your help flybynite. Since I experienced the bug with both ftp and http, I tried to find an alternative syntax for the resume command. I think that the following syntax should have the same effect, but does not require that the file name variable be passed in order to replace a partial file. It is as follows:

```
RESUMECOMMAND="/usr/bin/wget -N -t 5 --passive-ftp \${URI} -P \${DISTDIR}"
```

With the -N option, only new files will be downloaded in place of the old ones. With -N, a file is considered new if one of these two conditions is met:

   1. A file of that name does not already exist locally.

   2. A file of that name does exist, but the remote file was modified more recently than the local file. 

More importantly, if the local file does not exist, or the sizes of the files do not match, Wget will download the remote file no matter what the time-stamps say.
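The -N decision can be sketched in shell terms (illustrative only; wget implements this check internally, and the filenames here are made up):

```

# Fake a stale local file and a newer "remote" one:
touch -d '2004-01-01' local.tbz2
touch -d '2004-06-01' remote.tbz2

# The -N rule: fetch when no local copy exists, or the remote is newer.
if [ ! -e local.tbz2 ] || [ remote.tbz2 -nt local.tbz2 ]; then
    echo "download"
else
    echo "skip"
fi

```

Run as-is this prints "download"; touch both files with the same date and it prints "skip".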

----------

## flybynite

 *kmarasco wrote:*   

> I set up vsftp and I have the same issue.
> 
> Typo/poor attention to detail on my part?
> 
> 

 

Nope, my part, sorry.  I was using ftp without error on one of my boxes because I had changed the RESUMECOMMAND so many times in testing that I had left it set wrong.  The portage bug affects both ftp and http.

 *kmarasco wrote:*   

> 
> 
> I tried to find an alternative syntax for the resume command. I think that the following syntax should have the same effect, but does not require that the file name variable be passed in order to replace a partial file.
> 
> 

 

I appreciate the help.  Wget, portage, and RESUMECOMMAND together have so many options that finding the right combination is complex.  I wish the wget docs were better....

The good news is that this problem should go away soon!!  

Thanks to you, I reported this bug and the portage devs have deemed it worthy.   It has been fixed in CVS.  This means the next release of portage should contain the fix, and then you can go back to using $FILE!!

Until then, your -N suggestion is probably the best fix.  It's not a perfect solution for http-replicator because of a couple side effects that limit http-replicator.  One such side effect is that with -N, an internet connection must exist.  With the original RESUMECOMMAND, http-replicator doesn't need an internet connection to serve from the cache.

----------

## matbintang

flybynite! 

You Da Man!!

 :Very Happy: 

----------

## kmarasco

Thanks again for all of your help. I commend you for your excellent work and ongoing support effort  :Exclamation: 

----------

## flybynite

Thanks ( smiles proudly)

I just hope you still feel that way when I say you need to change portage_util.writemsg back to portage.writemsg !!

Or download repcacheman ver 3.2 at http://www.updatedlinux.com/replicator/portagefix/repcacheman

----------

## BlinkEye

i do have a problem: repcacheman runs every 5 minutes but even though it should have nothing to do (i.e. i didn't download any packages and /usr/portage/distfiles/ is empty) it eats a lot of my precious memory (my server is an oldie):

```
  PID  PPID  UID USER     RUSER    TTY         TIME+  %CPU %MEM S COMMAND

4174  4163    0 root     root     ?          0:28.75 10.6 12.7 R repcacheman

```

how could i prevent that?

----------

## flybynite

The release of portage 2.0.51 has caused a number of problems.

These problems haven't affected how well http-replicator itself runs, only the support script repcacheman.

I have updated ebuilds with changes to support portage 2.0.51.  I need help testing:

http://www.updatedlinux.com/replicator/http-replicator-flybynite-1.6a.tar.bz2

This release contains 5 versions:

*  net-misc/http-replicator :

        [   ] 2.0-r2 (0) OVERLAY

        [   ] 2.0-r3 (0) OVERLAY

        [   ] 2.1_rc3 (0) OVERLAY

        [   ] 2.1_rc3-r1 (0) OVERLAY

        [  I] 2.1 (0) OVERLAY

2.0-r2 and 2.1_rc3 DEPEND on portage 2.0.50 and won't work with the current stable portage 2.0.51.  I left these in only for the 1 or 2 gentooers who don't like to upgrade  :Smile: 

2.0-r3, 2.1_rc3-r1 and 2.1 will work with any current portage version.  2.1 primarily adds external proxy support and improved logging.

2.1 is the latest and is stable, but the changes aren't fully documented in the howto yet.

Follow the howto at the start of this thread for all versions.  Then if you upgraded to 2.1, look at these changes:

**Note The config file location has changed in 2.1.  Edit /etc/conf.d/http-replicator to add your external proxy and check other defaults.

**Note repcacheman has changed in 2.1.  If you changed the default cache dir or the default user, you must call repcacheman with those options.  

```

repcacheman --user USER --dir /path/to/cache

```

Last edited by flybynite on Sat Oct 30, 2004 9:56 pm; edited 2 times in total

----------

## flybynite

 *BlinkEye wrote:*   

> 
> 
> how could i prevent that?

 

Fix is in the above post....

----------

## flybynite

I had a question about repcacheman and what it does and how often it should be run.

Repcacheman only needs to run after emerge is run on the server (except emerge sync).

Here is some more info on repcacheman:

 *Quote:*   

> 
> 
> Replicator can't share the same dir with portage because portage doesn't play nice.
> 
> This means that on the server, emerged packages exist in both replicator's cache and /distfiles.  The files in /distfiles are wasted space and dups.  One replicator function is to delete the dups in distfiles.
> ...

 

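In other words, a 5-minute cron job is overkill; once a day (or a manual run after emerging on the server) is plenty. Here is a sketch of a nightly crontab entry, assuming the user and cache dir were changed to "portage" and /var/cache/http-replicator (substitute your own values, or drop the options if you kept the defaults):

```

# m h dom mon dow user command  (sketch; values are placeholders)
0 3 * * * root /usr/bin/repcacheman --user portage --dir /var/cache/http-replicator

```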
----------

## meowsqueak

I upgraded to 2.1 but I'm getting this problem on one of my LANs. The following spews out about 20 times:

```
...

Connecting to 10.16.10.224:8080... connected.

Proxy request sent, awaiting response... 301 Moved Permanently

Location: http://gentoo.osuosl.org/distfiles/fontconfig-2.2.3.tar.gz [following]

--17:30:43--  http://gentoo.osuosl.org/distfiles/fontconfig-2.2.3.tar.gz

           => `/usr/portage/distfiles/fontconfig-2.2.3.tar.gz'

Connecting to 10.16.10.224:8080... connected.

Proxy request sent, awaiting response... 301 Moved Permanently

Location: http://gentoo.osuosl.org/distfiles/fontconfig-2.2.3.tar.gz [following]

...

20 redirections exceeded.

!!! Couldn't download fontconfig-2.2.3.tar.gz. Aborting.
```

Funnily enough, it works fine on another LAN I look after. I'm using an external proxy too, but that IP address (10.16.10.224) is the address of my http-replicator server, and I'm running 'emerge -Duav world' on the same host, so it's trying to connect to itself and then failing. My conf file includes  '--ip 10.*.*.*'

----------

## meowsqueak

Ok, I did some reading into the 301 response:

http://www.checkupdown.com/status/E301.html

According to the output, wget first tries to download http://gentoo.oregonstate.edu/distfiles/fontconfig-2.2.3.tar.gz. The proxy request is sent and the response is a 301 error (presumably from the original server?). The Location specifies http://gentoo.osuosl.org/distfiles/fontconfig-2.2.3.tar.gz as the alternative URL, and wget follows this. But when it tries to download this, it gets another 301 with the location set to the exact same URL and the process repeats until the limit (20) is hit. So, is this a problem with this server in particular, or is it a problem with the http-replicator proxy? I suspect the server is fine because a manual wget of the osuosl.org file works fine (no followed location). Does http-replicator cache server responses or something?

----------

## flybynite

First, go ahead and update the link to http://gentoo.osuosl.org/ - since http://gentoo.oregonstate.edu/ is outdated anyway.  I've updated the howto.

The response isn't cached by http-replicator - it might be cached by your ISP or somewhere else.

Then if you still have this problem, upgrade to http-replicator 2.1, set debug in the config and post the logs...

----------

## meowsqueak

Hmmm, I wonder if my squid proxy is caching something - here's the log:

```
04 Nov 2004 09:23:19 DEBUG: HttpClient 170 connected to 10.16.10.224:42290

04 Nov 2004 09:23:19 DEBUG: HttpClient 170 received header:

  GET http://gentoo.oregonstate.edu/distfiles/fontconfig-2.2.3.tar.gz HTTP/1.0

  User-Agent: Wget/1.9

  Host: gentoo.oregonstate.edu

  Accept: */*

04 Nov 2004 09:23:19 INFO: HttpClient 170 received request for http://gentoo.oregonstate.edu/distfiles/fontconfig-2.2.3.tar.gz

04 Nov 2004 09:23:19 DEBUG: HttpClient 170 connecting to gentoo.oregonstate.edu

04 Nov 2004 09:23:19 DEBUG: HttpServer 170 connected to 10.10.130.123:3128

04 Nov 2004 09:23:19 DEBUG: HttpServer 170 received header:

  HTTP/1.0 301 Moved Permanently

  Date: Wed, 03 Nov 2004 20:23:19 GMT

  Server: Apache/2.0.52 (Debian GNU/Linux)

  Location: http://gentoo.osuosl.org/distfiles/fontconfig-2.2.3.tar.gz

  Content-Length: 266

  Content-Type: text/html; charset=iso-8859-1

  X-Cache: MISS from squid-proxy

  Proxy-Connection: close

04 Nov 2004 09:23:19 DEBUG: HttpClient 170 closed

```

Why is the first GET still trying http://gentoo.oregonstate.edu ? I don't have that in make.conf any more. Emerge says 

```
>>> Downloading http://gentoo.oregonstate.edu/distfiles/fontconfig-2.2.3.tar.gz
```

Is http://gentoo.mirrors.pair.com providing me with this perhaps? I tried:

```
$ GENTOO_MIRRORS="http://gentoo.osuosl.org" sudo emerge -a --oneshot fontconfig

...

>>> Downloading http://gentoo.oregonstate.edu/distfiles/fontconfig-2.2.3.tar.gz

```

Not sure I understand what is going on here...

----------

## flybynite

 *meowsqueak wrote:*   

> Hmmm, I wonder if my squid proxy is caching something 
> 
> 

 

Squid's a likely culprit, but let's fix the link first......

 *meowsqueak wrote:*   

> 
> 
> Why is the first GET still trying http://gentoo.oregonstate.edu ? I don't have that in make.conf any more. 
> 
> Not sure I understand what is going on here...

 

If you followed the howto, you put gentoo.oregonstate.edu into your /etc/portage/mirrors.

That "local" mirror is checked before the make.conf mirrors.  I'd bet you just need to change /etc/portage/mirrors....
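For reference, /etc/portage/mirrors is just the mirror set name followed by one or more URLs, and the local set is tried before the GENTOO_MIRRORS in make.conf. An updated file might look like this (a sketch, using the current osuosl address):

```

local http://gentoo.osuosl.org

```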

----------

## meowsqueak

 *flybynite wrote:*   

> That "local" mirror is downloaded before the make.conf mirrors.  I'd bet you just need to change the /etc/portage/mirrors....

 

Bingo! I completely forgot about that file. Thanks for the help, I'm sorted now.

----------

## Maxwell

Well, i suppose i must be doing something stupid, but i need some help.

I installed http_replicator as i should, but as i have an http proxy on port 8080, i switched replicator's port to 5000. But it looks like http_replicator isn't used by my notebook on my internal lan. The value "http_proxy" in make.conf is the ip and port of the server, correct?

Both my /usr/portage/distfiles and my /var/log/http-replicator are empty. The log file also doesn't report anything when i emerge a file at the server.

Suggestions?

----------

## meowsqueak

 *Maxwell wrote:*   

> Both my /usr/portage/distfiles and my /var/log/http-replicator are empty. The log file also doesn't report anything when i emerge a file at the server.

 

/var/cache/http-replicator?

Did you start the http-replicator daemon?

```
 # /etc/init.d/http-replicator start
```

----------

## Maxwell

yes, i did start it

The output of lsof -i is that it is listening at port 5000.

But it doesn't get connected by any client...

----------

## flybynite

First, I need to see the client's emerge output.  Copy the output showing where the file is downloaded.

----------

## Maxwell

Ok

In a client, when i emerge something, the download part output is:

 *Quote:*   

> 
> 
> emerge (1 of 1) mail-client/mozilla-thunderbird-0.9 to /
> 
> >>> Downloading http://darkstar.ist.utl.pt/gentoo/distfiles/thunderbird-0.9-source.tar.bz2
> ...

 

The client's make.conf part is:

 *Quote:*   

> 
> 
> # Default fetch command (5 tries, passive ftp for firewall compatibility)
> 
> HTTP_PROXY="http://x.x.x.y:5000"
> ...

 

In the server i have the following in make.conf

 *Quote:*   

> 
> 
> # PORTDIR_OVERLAY is a directory where local ebuilds may be stored without
> 
> #     concern that they will be deleted by rsync updates. Default is not
> ...

 

The proxy setting that appears when i emerge something in the client is defined in /etc/profile.

Can anyone find a problem?

Help would be much apreciated

----------

## flybynite

 *Maxwell wrote:*   

> 
> 
> Connecting to proxy.d[x.x.x.x]:3128... connected.
> 
> 

 

This shows that the client portage is connecting to a proxy other than http-replicator.  You must have another proxy set somewhere..

 *Maxwell wrote:*   

> 
> 
> The client's make.conf part is:
> 
> HTTP_PROXY="http://x.x.x.y:5000"
> ...

 

This shows one reason why the other proxy is being used.  http_proxy should be in lower case.

After you lower-case the client and server http_proxy, try the emerge again.  If portage still uses the other proxy, you need to find out where you have a proxy set to xxxx:3128.

This could be set in a couple of places.

 *Maxwell wrote:*   

> 
> 
> The proxy setting that appears when i emerge something in the client is defined in /etc/profile.
> 
> 

 

Ok, you already seem to know where the other proxy is being set.  It shouldn't surprise you that it is being used.  I guess you just don't know how to unset it.

One of the files in /etc/env.d probably contains the xxx:3128 proxy definition.  Remove that definition.  Then you need to run 

```

env-update

source /etc/profile

```

You may need to completely log out and log back in again to reset all terminals.

Do you need the xxx:3128 proxy to get to the net? If you still need it to get to the net, then put that proxy in /etc/conf.d/http-replicator.
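To hunt down that stray definition, grep works well. A sketch, demonstrated against a throwaway directory so it is safe to paste; on a real box you would point the same grep at /etc/env.d:

```

# Set up a throwaway copy of an env.d-style file containing a proxy line:
mkdir -p /tmp/envd-demo
printf 'http_proxy="http://proxy.example:3128"\n' > /tmp/envd-demo/99local

# List every file that mentions the proxy port:
grep -rl "3128" /tmp/envd-demo

```

This prints /tmp/envd-demo/99local; on the server, `grep -rl 3128 /etc/env.d` shows which file to edit.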

----------

## Maxwell

Ok, big mistake!

http_proxy now is in lower case. Now i can connect to http_replicator!! (yes!)

But i need a proxy to connect to the net. How do i set http_replicator to use it? I've already set it in http_replicator.conf and it didn't fetch anything.

Thanks in advance

----------

## meowsqueak

In /etc/conf.d/http-replicator (not /etc/http-replicator.conf) you simply add a line that looks like this:

```
DAEMON_OPTS="$DAEMON_OPTS --external 192.168.0.1:3128"
```

I'm not sure how to deal with authentication, if required.

----------

## Maxwell

Now i've done it!

I was using an ultra old version. It didn't have the /etc/conf.d/http-replicator file. Now it works as it should. If i find any problems i will tell you guys!

Thank you all!!

----------

## johntramp

Hi.  I am trying to set up Http-Replicator with not much success. At the moment I am just trying with a client and a server. I am sure I have something wrong with the make.conf's as I was a little confused with the wording of the guide.  *Quote:*   

>  Add http_proxy="http://YourProxyHere.com:8080" Line replacing YourProxyHere.com with your proxy hostname or IP address.

  What does it mean by "YourProxyHere"? Is that the address of the http-replicator server or my router?  Also, do I have to change the address of the gentoo_mirrors on the client to be the replicator server?

Here are my 2 make.conf's,  first the server then the client.

 *Quote:*   

> CHOST="i686-pc-linux-gnu" 
> 
> CFLAGS="-O2 -march=pentium2 -fomit-frame-pointer -pipe"
> 
> CXXFLAGS="${CFLAGS}"
> ...

 

 *Quote:*   

> CFLAGS="-O3 -march=athlon-xp -pipe -fomit-frame-pointer"
> 
> CHOST="i686-pc-linux-gnu"
> 
> CXXFLAGS="${CFLAGS}"
> ...

 

If anyone can point me in the right direction it would be appreciated  :Smile: 

Thanks

----------

## meowsqueak

YourProxyHere.com:Port is the IP address or DNS name of your http-replicator server and the access port. You set this on all client machines that you want to use the replicator. You can set this on the http-replicator server too, so that it uses itself to download source.

Keep the mirrors as is, just make sure they are http mirrors. The client makes a request for a file from one of your mirrors via the http-replicator proxy which then downloads and caches the file (or retrieves it from the cache if it's already downloaded), as well as relaying the incoming file to the client.

I hope this makes more sense?

----------

## johntramp

so there is no difference between the server and client's make.conf?

----------

## JayBee

Not sure if this has been covered in the thread - too long to wade through, and if so, could it be put near the front?

When I start an emerge from a client machine, I get a 401 not found error (from memory - away from the box at the moment), and it tries from all the mirrors it has listed. It appears that http-replicator is requesting the file on the server and downloading it, but the client isn't waiting to receive from the server.

Any ideas as to how to solve this?

Cheers

----------

## meowsqueak

 *johntramp wrote:*   

> so there is no difference between the server and client's make.conf?

 

Indeed - no difference in respect to http-replicator. Consider the http-replicator daemon (or service, or server, or provider) to be independent from the host it actually runs on. A client (or a server) is a program rather than a host; it just happens to run on a host. So wget on the http-replicator machine is actually a client to a server on the same host. It's still client-server, it just doesn't go through the external network interface.

Last edited by meowsqueak on Tue Nov 16, 2004 8:06 pm; edited 1 time in total

----------

## johntramp

ah  :Smile:  I understand now, thanks meowsqueak.

Q: what will happen if I set this up but the computer with the daemon on it is offline?  Will my computer just emerge as it would without the replicator? Also, what will happen when it needs to use ftp?

----------

## RaraRasputin

Hi,

why is it suggested to add osuosl.org to the mirrors file? if i add it, all mirrors in my make.conf are ignored and all files are downloaded from osuosl.org, which is really slow.

----------

## johntramp

I think any http:// mirror will work

----------

## flybynite

 *GuruSwami wrote:*   

> 
> 
> When I start an emerge from a client machine, I get a 401 not found error (from memory - away from the box at the moment)
> 
> 

 

You must post the errors.  Start with some of the emerge output...

You most likely have a typo in your config...

----------

## VinnieNZ

I'm attempting to setup a cascading distfiles solution for work.

Basically we have a main network with smaller regional networks connected into it.  The link to the internet from the main network is fast; the internal link to each regional network is slowish.  Because the link to a regional network is slow, but has many clients at the end, there are large advantages to having a system like this working.

I have setup http-replicator on a server in one of the regions, and one on a server locally.

What I want to happen is when a client computer from the regions requests a package it looks at the regional server first.  If it doesn't have it, then the regional server should attempt to fetch the file from the main network server, and if the main network server doesn't have it, it should go out to the internet and fetch it.  But I want to have a copy of that package left in the cache at every point.

In an example:

A regional client computer requests a new package and none of our servers have the file, so the file should be sourced from the internet then left on our package cache on the main network, as well as on the regional server, and finally in the distfiles dir of the regional client originally requesting the file.

I like the look of using a system like this because it means that we don't have to serve off an ftp- or http-style system, and it doesn't interfere with our normal http proxy.

Is what I'm trying to do possible via http-replicator, or should I be looking at something else?

Cheers.

----------

## flybynite

 *VinnieNZ wrote:*   

> I'm attempting to setup a cascading distfiles solution for work.
> 
> 

 

Done....

Http-Replicator has an option for an external proxy.  Set the regional http-replicator to use the proxy of the main http-replicator proxy.

```

## Do you need a proxy to reach the internet?

## This will forward requests to an external proxy server:

## Use one of the following, not both:

#DAEMON_OPTS="$DAEMON_OPTS --external somehost:1234"

#DAEMON_OPTS="$DAEMON_OPTS --external username:password@host:port"

```

Then it will work exactly as you requested!!

Although this exact setup hasn't been done yet, I don't see any reason why it won't work.  The testing for the external proxy support was done using another instance of http-replicator....
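Concretely, the regional server's /etc/conf.d/http-replicator would point its --external option at the main network's replicator instead of a real proxy (a sketch; "mainserver.lan" is a placeholder for your main server's name or IP):

```

## Regional server: cache misses are forwarded to the main network's
## replicator, which caches the file itself before passing it down.
DAEMON_OPTS="$DAEMON_OPTS --external mainserver.lan:8080"

```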

----------

## VinnieNZ

Agreed, the above does exactly what I want it to.  FANTASTIC piece of work!   :Very Happy: 

For people wondering where to find the above, its in /etc/conf.d/http-replicator.

Cheers flybynite!

----------

## flybynite

Thanks,  once you get this setup, let me know how it works...

----------

## VinnieNZ

 *flybynite wrote:*   

> Thanks,  once you get this setup, let me know how it works...

 

Have got this working on a few servers now and it seems to work great for what we do  :Smile: 

----------

## flybynite

 *VinnieNZ wrote:*   

> 
> 
> Have got this working on a few servers now and it seems to work great for what we do 

 

Thanks for the report!

A new version of http-replicator is due out in the next few days.  It requires no changes to the RESUMECOMMAND and supports serving as a BINHOST for binary packages!

----------

## flybynite

Http-Replicator 3.0 is here!!

New features:

Resuming downloads is supported client side.  This means no changes to the RESUMECOMMAND in /etc/make.conf are required or desired!

Dir serving.  Http-replicator will serve a dir of your choosing.  One example is the /usr/portage/packages/All dir which holds portage's binary packages!!  Http-Replicator is your best option to fully support multiple gentoo boxes on a LAN of any size!!  Emerge or transfer binary packages to http-replicator and serve them to your LAN!!

Cache serving. Http-replicator will serve the cache dir.  Want to know what is in the cache or download a particular file from the cache?  Just point your browser at http-replicator! 

Repcacheman updates.  Repcacheman will automatically install http-replicator and md5 check your existing packages!  Uses same config as http-replicator for ease of use!

Smaller runtime memory footprint.  Lean and mean!

I need a few experienced users to upgrade and report before I update the HOWTO!  If you need the full HOWTO to upgrade then it will be out in a couple of days, just hang on ....

Download Latest.tar.bz2

and upgrade replicator to ver 3.0.

Comment out the old changes to RESUMECOMMAND in /etc/make.conf by adding a # before RESUMECOMMAND,

but leave the http_proxy line: http_proxy="http://yourbox.com:8080" 

If you wish, check the other defaults in /etc/conf.d/http-replicator

Restart http-replicator:

/etc/init.d/http-replicator restart

Point your browser to http://localhost:8080/All to see your binary packages.  Set client /etc/make.conf PORTAGE_BINHOST=http://serverbox.com:8080/All 

to activate fetching binaries and then emerge -gK xxx to test binary package hosting.   See man make.conf for more info.
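Pulled together, the client-side /etc/make.conf additions for 3.0 look like this (a sketch; serverbox.com stands in for your server's name or IP):

```

http_proxy="http://serverbox.com:8080"

# Only needed if you also want binary package fetching with emerge -gK:
PORTAGE_BINHOST="http://serverbox.com:8080/All"

```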

Point your browser to http://localhost:8080 to see your cache dir.

Try to interrupt an emerge fetch with control-c and then restart.  Portage should resume the download where it left off.

And then report your findings!!

Last edited by flybynite on Sun Dec 19, 2004 8:47 pm; edited 2 times in total

----------

## jbpros

Before anything, I have to thank you flybynite as http-replicator is running for months on two LANs I manage and it saved lot of expensive bandwidth!  :Smile: 

Here is a "little" enhancement request: would it be possible to configure http-replicator  in a way that it uses an external command to fetch packages instead of its internal http client?

What I would like to do is to have a deltup system between http-replicator and the remote http servers. This could save yet more bandwidth.

The first issue I can think of is that the resume system would stop working. Maybe you could add two configuration directives: one containing the command to call for normal downloads and another for resumed downloads. It could go even further by adding a third directive to disable resuming (deltup cannot resume downloads AFAIK).

I don't know if it's feasible or realistic... anyway I think that reducing traffic usage is always a good thing  :Smile: 

Again thank you!

----------

## dripton

flybynite:

The name of your archive is a bit misleading.  Maybe bump the "1.7" to "3.0"

After untarring your archive under my PORTAGE_OVERLAY directory (/usr/local/portage), "emerge http-replicator" died with 

"!!! Security Violation: A file exists that is not in the manifest.

!!! File: files/repcacheman"  But this turned out to be cruft from the previous http-replicator tarball, not present in the latest one, and easy to clean up by just removing that file.  It's probably a good idea to delete an existing $PORTAGE_OVERLAY/net-misc/http-replicator directory before expanding the new tarball.

Resuming broken downloads worked great for me.  I tested breaking and resuming both a file that existed in the cache, and a remote download, with no RESUMECOMMAND.

The cache directory is viewable with a web browser as advertised.

I haven't tested serving binary packages yet.

Thanks again for a very useful program.

----------

## flybynite

 *jbpros wrote:*   

> Before anything, I have to thank you flybynite as http-replicator is running for months on two LANs I manage and it saved lot of expensive bandwidth! 
> 
> 

 

Thanks!

 *jbpros wrote:*   

> 
> 
> Here is a "little" enhancement request: would it be possible to configure http-replicator  in a way that it uses an external command to fetch packages instead of its internal http client?
> 
> 

 

I've tried this before.  It seems no external client can pass the necessary headers, errors, and data in a way that is robust and useful.  I've even got to the point of trying to hack wget to work, but if you have to hack it to get it to work, why even use it?  What do you think this would gain?

 *jbpros wrote:*   

> 
> 
> What I would like to do is to have a deltup system between http-replicator and the remote http servers. This could save yet more bandwidth.
> 
> 

 

I'm not sure what exactly you're trying to accomplish here.  But that seems backwards.  Can't deltup just retrieve the updates through replicator so you only have to download each update once?  Is that what you're trying to achieve?

----------

## flybynite

 *dripton wrote:*   

> flybynite:
> 
> The name of your archive is a bit misleading.  Maybe bump the "1.7" to "3.0"
> 
> 

 

 I agree but won't change this yet.

Right now the archive contains http-replicator ebuild versions 1, 2, and the testing version 3.  Should I name it http-replicator-1and2andtestingversion3-ebuilds.tar.gz?  That's why I just left it for now.

 *dripton wrote:*   

> 
> 
> After untarring your archive under my PORTAGE_OVERLAY directory (/usr/local/portage), "emerge http-replicator" died with 
> 
> "!!! Security Violation: A file exists that is not in the manifest.
> ...

 

Yes, I renamed a file for the first time since 1.0

 *dripton wrote:*   

> 
> 
>  It's probably a good idea to delete an existing $PORTAGE_OVERLAY/net-misc/http-replicator directory before expanding the new tarball.
> 
> 

 

Agreed. HOWTO updated....

Thanks for the report!!

----------

## jbpros

 *Quote:*   

> I'm not sure what exactly your trying to accomplish here.  But that seems backwards.  Can't deltup just retrieve the updates through replicator so you only have to download each update once?  Is that what your trying to achieve?

 

My idea was to reduce internet traffic by making http-replicator use the deltup system transparently. Yes, I could make portage use the deltup script and specify the replicator proxy in that script, but that would not be transparent, and it would depend on the requesting host instead of the replicator "server".

Practically, when a host requests a package from http-replicator, replicator checks whether the package is in the cache; if so, it performs as usual. If not, it would delegate the download to deltup, which could base the download on existing packages in http-replicator's cache. 

Theoretically this is possible. Maybe this could be achieved by implementing the deltup system within http-replicator. 

The advantage of this solution is the transparency. It's quite a good point in a network containing several gentoo nodes. 

That's my idea. Do you think it is completely crazy, hard to do, or sort of realistic?

----------

## kmarasco

One issue that I came across was with programs with fetch restrictions, such as sun-jdk. Even though the source packages are in the http-replicator cache, they are not seen by portage, and must be moved or copied back to distfiles in order for the emerge to move forward. 

I noticed this when doing an "emerge -eD world".

----------

## flybynite

 *kmarasco wrote:*   

> One issue that I came across was with programs with fetch restrictions, such as sun-jdk. Even though the source packages are in the http-replicator cache, they are not seen by portage, and must be moved or copied back to distfiles in order for the emerge to move forward. 
> 
> 

 

You are correct.

This is entirely a portage weakness, not an http-replicator issue at all.  I have a workaround for some of portage's goofiness.  

Make sure you have the /etc/portage/mirrors portage workaround.  It was added to the howto, so it might have been added after you first installed if you're a long-time user.

This won't cure all of portage's ills, just some.  It makes portage check replicator for RESTRICT=nomirror packages, but not for RESTRICT=nofetch packages like Sun's Java (sun-jdk). 

Feel free to file a bug.  Portage should check a local mirror even for "nofetch" packages, in my opinion!!  Don't even mention http-replicator.  Just mention the fact that not checking the "local mirror" for all packages defeats the purpose of a local mirror!!

Here is the nomirror workaround from the howto in case you or anyone else needs it:

 *Quote:*   

> 
> 
> Also, some packages in portage have a RESTRICT="nomirror" option which will prevent portage from checking replicator for those packages. The following will override this behavior. Create the file "/etc/portage/mirrors" containing:
> 
> ```
> ...

 

----------

## flybynite

 *jbpros wrote:*   

> 
> 
> Theoretically this is possible. Maybe this could be achieved by implementing the deltup system within http-replicator. 
> 
> That's my idea. Do you think it is completely crazy, hard to do, or sort of realistic?

 

Http-Replicator has more development in store before I would consider it mature.  Maybe once I consider it mature I would look into deltup again.  Until then, it would seem your best option is to use them together, but as separate programs...

----------

## johntramp

I am sorry if this has been asked before, but is it possible to use this from the beginning of the gentoo install process?

[edit]Yes, it seems that installing it with --nodeps before bootstrap works; I will re-emerge it again later after bootstrap / emerge system.   :Smile:   [/edit]

----------

## gringo

many thanks for this great tool, it was really helpful here and has been running fine for two weeks now.

One question: how would you manage packages for several different archs? 

I have a powerbook here and I'm emerging with ACCEPT_KEYWORDS="ppc" + repcacheman for ppc-specific stuff to have it in the cache, but maybe there's a better solution ...

TIA

----------

## flybynite

I'm not really sure what you need.  Replicator should work fine with different archs.  Do you mean repcacheman?  It works on different archs too...

----------

## gringo

sorry for my stupid question, I forgot that this app _is_ a proxy, so any ppc-specific packages will also be cached by the server (after repcacheman, of course) even if you download on another machine of your network.

many thanx again, great app, that's what I was looking for  :Wink: 

cheers

----------

## woody77

Can anyone post a mirror to the downloads for this?  The original poster's account appears to be suspended (bandwidth utilization reasons?).

Thanks,

Woody

----------

## flybynite

 *woody77 wrote:*   

> Can anyone post a mirror to the downloads for this?  The original poster's account appears to be suspended (bandwidth utilization reasons?).
> 
> Thanks,
> 
> Woody

 

Sorry,  I decided to switch hosting providers.  The links are back up at a temporary location.

md5sum http-replicator-flybynite-1.7.tar.bz2:

f2ef1b7ef73aa6657122748b238947ff

md5sum http-replicator-flybynite-1.6a.tar.bz2:

e02ce85d45f3774b6a98c49507cf8279

----------

## hothead

Hi,

I've a problem when running repcacheman

```
Found 17950 ebuilds.

Extracting the checksums....

Missing digest: app-office/abiword-2.1.90

Missing digest: kde-base/kde-meta-3.3.2

Missing digest: kde-base/kdeaddons-meta-3.3.2

Missing digest: kde-base/kdeadmin-meta-3.3.2

Missing digest: kde-base/kdebase-3.1-r1

Missing digest: kde-base/kdebase-meta-3.3.2

Missing digest: kde-base/kdeedu-meta-3.3.2

Missing digest: kde-base/kdegames-meta-3.3.2

Missing digest: kde-base/kdegraphics-meta-3.3.2

Missing digest: kde-base/kdemultimedia-meta-3.3.2

Missing digest: kde-base/kdenetwork-meta-3.3.2

Missing digest: kde-base/kdepim-meta-3.3.2

Missing digest: kde-base/kdesdk-meta-3.3.2

Missing digest: kde-base/kdetoys-meta-3.3.2

Missing digest: kde-base/kdeutils-meta-3.3.2

Missing digest: kde-base/kdewebdev-meta-3.3.2

Missing digest: media-plugins/xmms-plugins-1.0.1

Missing digest: media-radio/drm-1.0.6

Missing digest: media-video/vdrplugin-mldonkey-0.0.4a

Missing digest: net-mail/mulberry-3.0.0_alpha5

Missing digest: net-mail/mulberry-3.0.0_beta9

Missing digest: net-news/klibido-0.2.0

Traceback (most recent call last):

  File "/usr/bin/repcacheman.py", line 162, in ?

    digestpath = os.path.dirname(digestpath)+"/files/digest-"+pv

  File "/usr/lib/python2.3/posixpath.py", line 119, in dirname

    return split(p)[0]

  File "/usr/lib/python2.3/posixpath.py", line 77, in split

    i = p.rfind('/') + 1

AttributeError: 'NoneType' object has no attribute 'rfind'

```

I've several overlays (gentoo-de, kde-metaebuilds, bmg-main, usr) beside the main gentoo tree - maybe that's the reason. 

I can deal with the missing digests, but I don't know what the last traceback is about. Does anyone know how to fix this?

Thanks.

hothead

----------

## flybynite

This is most likely caused by your overlays not conforming to the gentoo standard.

This:

```

ebuild /path/to/ebuild/overlay/my.ebuild digest

```

will create the proper digest.

The last error is probably because that overlay doesn't even have a files dir to contain the digest...

You should also take this time to upgrade to the latest version if you haven't!

----------

## hothead

I found out that the gentoo-de tree is causing this problem. 

Now I've another question:

Is it good to use the same directory for distfiles and the replicator cache? When I set them to the same directory, I noticed that repcacheman marked all my distfiles as dupes and deleted them.

hothead

----------

## flybynite

 *hothead wrote:*   

> 
> 
> Is it good to use the same directory for distfiles and replicator cache?
> 
> 

 

No.  repcacheman was created because portage doesn't work well with others.  This is a portage issue that may change in the future.  The recent portage update that included lock files is a step in the right direction.  Prior to that update, just running two emerges could blow portage up.

Http-replicator requires a separate cache dir; repcacheman transfers files from portage's dir to the cache dir to ensure there is no wasted space.

What is different about the gentoo-de tree?  I am not at all familiar with it.

----------

## hothead

The gentoo-de tree has some more ebuilds that are not in the main portage tree. 

For example the whole vdr (plugins), ximian-openoffice, etc. 

You can view the cvs here: http://www.gentoo.de/viewcvs/gentoo-x86/

hothead

----------

## soulwarrior

Thanks for this great program   :Very Happy: 

I do have a question, does http-replicator delete obsolete files, which are no longer needed by any ebuild in the portage tree?

----------

## flybynite

Sorry for the delay, I've been out for a while..

I've made no attempt to purge outdated distfiles.  I've given the reasons why in an earlier post.  It's actually a harder problem than you think at first, because nobody can agree on what needs to be purged.

What I can offer is that http-replicator/repcacheman create no special worries, and any of the current scripts you can find on this board will work with replicator.

Search this board and you can find many scripts that purge/delete/trash distfiles based on no-longer-in-portage / newer-version-available / not-installed / etc....

Last edited by flybynite on Sat Feb 05, 2005 1:31 am; edited 1 time in total

----------

## Trespasser

I have just installed http-replicator on one virtual server running on a VMware ESX server (one of 4 esx-servers).

The good thing here is that the hardware is the same all the way round, so I just updated the http-replicator server and built the binary packages for everything new. After that I changed the make.confs of the other gentoo servers running on our ESX servers and updated them in a snap.

I have about 15-20 (some testing going on here) Gentoo servers running on ESX, so this can really make a complete Gentoo install quick (I know I could just copy the image file, but sometimes the disk space in the file isn't enough).

This works like a charm I must say.

Again, Thnx.

----------

## flybynite

Thanks for the report Trespasser!!

Many people only know Gentoo as a compile-from-source distro.  They don't realize how easy it is to compile a custom version of a package and then use http-replicator to distribute the binary to all your other machines.

This is the same way many people administer a large number of boxes using other distros.  They may use the binaries for the base system from distro "X", but they compile a custom version of Apache from source.  With gentoo, they can still use the binaries from the latest release, and use portage to compile apache exactly as they want it.  Then they can easily distribute the binary to other gentoo machines.  This is a similar workflow to other distros, only easier because you have a few thousand other admins helping through portage.

Http-Replicator is the best way I know of to manage multiple gentoo boxes.  It multiplies the power of portage!  I hope many others will see the advantages and use http-replicator to administer multiple machines like you're doing.

----------

## jopalm

I've been successfully using HTTP-Replicator on my six-box LAN for several months.  I'm very pleased with its functionality and performance.  However, now I'm attempting to move my cache server from one box to another and have run into a problem.

Everything appears to be correctly configured on both server and clients.  HTTP-Replicator starts successfully.  The trouble begins when a client wants to emerge a package and tries connecting to the server for the source file.  At this point the client states the connection is refused.  A look at the server log indicates that, upon connection, the server has stopped.

```
magnolia etc # tail /var/log/http-replicator.log

22 Jan 2005 22:36:49 INFO: HttpReplicator started

22 Jan 2005 22:37:06 STAT: HttpClient 1 bound to 192.168.0.4

22 Jan 2005 22:37:06 ERROR: HttpClient 1 caught an exception in __getattr__: '_socketobject' object has no attribute 'data'

magnolia etc #
```

For the time being, I've gone back to my original server, but I'd really like to move this to the other box, so any tips or help would be greatly appreciated!

A quick look through this document didn't turn up a similar problem.  (I apologize if it's there and I missed it...)

Regards,

-John

----------

## flybynite

For some reason, the file requested by the client is missing...

I would bet you have a typo in the make.conf of the client.  Post the FETCHCOMMAND you're using.  You might have left a space or an unbalanced quote when you changed it to the new server.

If you don't spot it, post some more info.  Your configs and the output of the emerge could also be helpful....

You can confirm some of your settings like this:

```

source /etc/make.conf

echo $FETCHCOMMAND

echo $RESUMECOMMAND

echo $http_proxy

```

Also, which version of http-replicator are you using?  Have you upgraded to the latest?

----------

## jopalm

Thanks for the response!

I'm off-site at the moment and won't be able to get the specifics you inquired about until this evening (PST).

 *Quote:*   

> For some reason, the file requested by the client is missing... 
> 
> I would bet you have a typo in the make.conf of the client.   Post the FETCHCOMMAND you're using. You might have left a space or an unbalanced quote when you changed it to the new server. 

 

That was my original thinking, too, but now I'm not so sure.  First, all six boxes were (for several months) correctly connecting to and receiving files from my original host (ballard).  The only edit to five of them was to change the proxy from ballard to magnolia.  (This edit literally consisted of ballard^H^H^H^H^H^H^Hmagnolia.)  Immediately all boxes failed to connect.  Switch back to ballard and all is fine again.

As a further experiment I set up a temporary server on a third box (fremont).  All clients can connect to ballard or fremont, but all fail to connect to magnolia.

For these reasons, I suspect the magnolia host configuration.

 *Quote:*   

> You can confirm some of your settings like this: 
> 
> Code: 
> 
> source /etc/make.conf 
> ...

 

Will do this evening.  Although, if memory serves, didn't the latest configuration call for no special FETCH and RESUME and thus they would be commented out?

 *Quote:*   

> Also which version of http-replicator are you using. Have you upgraded to the latest?

 

Existing host ballard was upgraded from an earlier version.  New host magnolia and temporary host fremont were new installs of the latest version.  I will verify I didn't somehow install an older version.

Thanks again for your assistance!

Regards,

-John

----------

## jopalm

OK, I believe I have collected the relevant data...

First, with the server set up and running on magnolia (new install), we get this failure mode during an emerge from ballard (actually, it's the same from any of the six clients):

```
ballard root # emerge --fetchonly eject

Calculating dependencies ...done!

>>> emerge (1 of 1) sys-apps/eject-2.0.13 to /

>>> Downloading http://distfiles.gentoo.org/distfiles/eject-2.0.13.tar.gz

--22:21:01--  http://distfiles.gentoo.org/distfiles/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Resolving magnolia... 192.168.0.5

Connecting to magnolia[192.168.0.5]:8080... connected.

Proxy request sent, awaiting response...

22:21:01 ERROR -1: No data received.

>>> Downloading http://distro.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/eject-2.0.13.tar.gz

--22:21:01--  http://distro.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Resolving magnolia... 192.168.0.5

Connecting to magnolia[192.168.0.5]:8080... failed: Connection refused.

>>> Downloading http://www.ibiblio.org/pub/Linux/utils/disk-management/eject-2.0.13.tar.gz

--22:21:01--  http://www.ibiblio.org/pub/Linux/utils/disk-management/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Resolving magnolia... 192.168.0.5

Connecting to magnolia[192.168.0.5]:8080... failed: Connection refused.

>>> Downloading http://www.pobox.com/~tranter/eject-2.0.13.tar.gz

--22:21:01--  http://www.pobox.com/%7Etranter/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Resolving magnolia... 192.168.0.5

Connecting to magnolia[192.168.0.5]:8080... failed: Connection refused.

!!! Couldn't download eject-2.0.13.tar.gz. Aborting.

!!! Fetch for /usr/portage/sys-apps/eject/eject-2.0.13.ebuild failed, continuing...

!!! Some fetch errors were encountered.  Please see above for details.

ballard root #
```

During this emerge, the server log on magnolia (with debug) reports...

```
25 Jan 2005 22:20:01 INFO: HttpReplicator started

25 Jan 2005 22:21:01 STAT: HttpClient 1 bound to 192.168.0.4

25 Jan 2005 22:21:01 ERROR: HttpClient 1 caught an exception, closing socket

Traceback (most recent call last):

  File "/usr/lib/python2.3/asyncore.py", line 69, in read

    obj.handle_read_event()

  File "/usr/lib/python2.3/asyncore.py", line 390, in handle_read_event

    self.handle_read()

  File "/usr/bin/http-replicator", line 156, in handle_read

    self.data.write(chunk) # append received data

  File "/usr/lib/python2.3/asyncore.py", line 365, in __getattr__

    return getattr(self.socket, attr)

AttributeError: '_socketobject' object has no attribute 'data'
```

The attempted emerge seems to have killed the server, as evidenced by there being none running to stop...

```
magnolia root # /etc/init.d/http-replicator restart

 * Stopping Http-Replicator...

No http-replicator found running; none killed.                            [ ok ]

 * Starting Http-Replicator...                                            [ ok ]

magnolia root #
```

As a temporary experiment, I have also freshly installed a server on fremont.  Changing from magnolia to fremont (with no other changes in client) results in this emerge:

```
ballard root # emerge --fetchonly eject

Calculating dependencies ...done!

>>> emerge (1 of 1) sys-apps/eject-2.0.13 to /

>>> Downloading http://distfiles.gentoo.org/distfiles/eject-2.0.13.tar.gz

--22:23:08--  http://distfiles.gentoo.org/distfiles/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Resolving fremont... 192.168.0.3

Connecting to fremont[192.168.0.3]:8085... connected.

Proxy request sent, awaiting response... 200 OK

Length: 59,504

100%[====================================>] 59,504        --.--K/s

22:23:08 (11.36 MB/s) - `/usr/portage/distfiles/eject-2.0.13.tar.gz' saved [59504/59504]

>>> eject-2.0.13.tar.gz size ;-)

>>> eject-2.0.13.tar.gz MD5 ;-)

>>> md5 src_uri ;-) eject-2.0.13.tar.gz

ballard root #
```

(oh yeah, I'm using port 8085 on fremont due to 8080 already being in use...  magnolia is using 8080)

Here is my make.conf for broken host magnolia...

```
magnolia root # cat /etc/make.conf

# These settings were set by the catalyst build script that automatically built this stage

# Please consult /etc/make.conf.example for a more detailed example

CC='gcc'

CXX='c++'

CFLAGS="-O2 -march=i686 -fomit-frame-pointer -pipe"

CHOST="i686-pc-linux-gnu"

CXXFLAGS="${CFLAGS}"

FEATURES="distcc"

GENTOO_MIRRORS="http://gentoo.osuosl.org"

MAKEOPTS="-j12"

PORTAGE_BINHOST=http://magnolia:8080/All

PORTDIR_OVERLAY="/usr/local/portage"

PORTAGE_TMPDIR="/var/tmp"

RSYNC_RETRIES=6

RSYNC_TIMEOUT=500

SYNC="rsync://rsync.gentoo.org/gentoo-portage"

#SYNC="rsync://magnolia/gentoo-portage"

USE="-X -alsa -gnome -gtk -kde"

# Default fetch command (5 tries, passive ftp for firewall compatibility)

http_proxy="http://magnolia:8080"

#FETCHCOMMAND="/usr/bin/wget -t 5 --passive-ftp \${URI} -P \${DISTDIR}"

#RESUMECOMMAND="/usr/bin/wget -t 5 --passive-ftp \${URI} -O \${DISTDIR}/\${FILE}"

magnolia root #
```

and predictably...

```
magnolia root # source /etc/make.conf

magnolia root # echo $FETCHCOMMAND

magnolia root # echo $RESUMECOMMAND

magnolia root # echo $http_proxy

http://magnolia:8080

magnolia root #
```

Curious, eh?

Any thoughts appreciated...

Regards,

-John

----------

## flybynite

 *jopalm wrote:*   

> 
> 
> Will do this evening.  Although, if memory serves, didn't the latest configuration call for no special FETCH and RESUME and thus they would be commented out?
> 
> Existing host ballard was upgraded from an earlier version.  New host magnolia and temporary host fremont were new installs of latest version.  I will verify I didn't somehow install an older version.
> ...

 

No special FETCH is correct, but it could have been a source of failure if set.

Let's find out what's different about the http-replicator boxes.

Portage version, http-replicator version, and python version for a working box and the non-working box:

```

emerge -va http-replicator python portage

```

----------

## flybynite

 *jopalm wrote:*   

> OK, I believe I have collected the relevant data...
> 
> 

 

Ok, what about 

```

source /etc/make.conf

echo $FETCHCOMMAND

echo $RESUMECOMMAND

echo $http_proxy

```

from the client after it has failed?

----------

## jopalm

Ok, here's a look at magnolia - the non-functional server:

```
magnolia root # emerge -va http-replicator python portage

These are the packages that I would merge, in order:

Calculating dependencies ...done!

[ebuild   R   ] net-misc/http-replicator-3.0  19 kB [1]

[ebuild   R   ] dev-lang/python-2.3.4  -X +berkdb -bootstrap -build -debug -doc +gdbm +ipv6* +ncurses +readline +ssl -tcltk -ucs2 7,020 kB

[ebuild   R   ] sys-apps/portage-2.0.51-r14  -build -debug (-selinux) 0 kB

Total size of downloads: 7,039 kB

Portage overlays:

 [1] /usr/local/portage

Do you want me to merge these packages? [Yes/No] n

Quitting.

magnolia root #
```

And a peek at fremont, the temporary server that's working:

```
fremont root # emerge -va http-replicator python portage

These are the packages that I would merge, in order:

Calculating dependencies ...done!

[ebuild   R   ] net-misc/http-replicator-3.0  19 kB [1]

[ebuild   R   ] dev-lang/python-2.3.4  +X* -berkdb* -bootstrap -build -debug -doc -gdbm -ipv6* -ncurses* -readline* +ssl -tcltk -ucs2 7,020 kB

[ebuild   R   ] sys-apps/portage-2.0.51-r14  -build -debug (-selinux) 270 kB

Total size of downloads: 7,310 kB

Portage overlays:

 [1] /usr/local/portage

Do you want me to merge these packages? [Yes/No] n

Quitting.

fremont root #
```

And ballard, one of the clients  (note - this was/is my original production server.  It appears that in setting up magnolia, I removed my portage overlay on ballard...)

```
ballard root # emerge -va http-replicator python portage

These are the packages that I would merge, in order:

Calculating dependencies

emerge: there are no ebuilds to satisfy "http-replicator".

ballard root # ls -l /usr/local/portage

ls: /usr/local/portage: No such file or directory

ballard root # emerge -va python portage

These are the packages that I would merge, in order:

Calculating dependencies ...done!

[ebuild   R   ] dev-lang/python-2.3.4  +X* +berkdb -bootstrap -build -debug -doc +gdbm* +ipv6 +ncurses +readline +ssl -tcltk -ucs2 7,020 kB

[ebuild   R   ] sys-apps/portage-2.0.51-r14  -build -debug (-selinux) 270 kB

Total size of downloads: 7,291 kB

Do you want me to merge these packages? [Yes/No] n

Quitting.

ballard root #
```

Here's a peek at environment variables on server magnolia with http-replicator up and running:

```
magnolia root # ps -ef | grep http-replicator

portage   6739     1  0 06:46 pts/1    00:00:00 /usr/bin/python /usr/bin/http-re

plicator -s -f --pid /var/run/http-replicator.pid --daemon --dir /var/cache/http

-replicator --user portage --alias /usr/portage/packages/All:All --log /var/log/

http-replicator.log --debug --ip 192.168.0.* --port 8080

root      6750  6589  0 06:47 pts/1    00:00:00 grep http-replicator

magnolia root # ps -ef | grep http-replicator

portage   6739     1  0 06:46 pts/1    00:00:00 /usr/bin/python /usr/bin/http-replicator -s -f --pid /var/run/http-replicator.pid --daemon --dir /var/cache/http-replicator --user portage --alias /usr/portage/packages/All:All --log /var/log/http-replicator.log --debug --ip 192.168.0.* --port 8080

root      6758  6589  0 06:47 pts/1    00:00:00 grep http-replicator

magnolia root # source /etc/make.conf

magnolia root # echo $FETCHCOMMAND

magnolia root # echo $RESUMECOMMAND

magnolia root # echo $http_proxy

http://magnolia:8080

magnolia root #
```

And finally, after a failed emerge attempt from ballard, resulting in http-replicator crashing, here are magnolia's variables again:

```
magnolia root # ps -ef | grep http-replicator

root      6766  6589  0 06:49 pts/1    00:00:00 grep http-replicator

magnolia root # source /etc/make.conf

magnolia root # echo $FETCHCOMMAND

magnolia root # echo $RESUMECOMMAND

magnolia root # echo $http_proxy

http://magnolia:8080

magnolia root #
```

If it's not functional, at least it's consistent!  :Wink: 

Regards,

-John

----------

## flybynite

Ok, latest replicator, and no unstable packages in use.

There is an update to portage, r15; it came out the same day as r14, which probably means r14 was buggy. You might want to upgrade...

You missed the env on the CLIENT, not the failed server magnolia.  My original theory is that the env on the CLIENT is in error.

Also, the /etc/conf.d/http-replicator from the working server and the failed server could be helpful.

We'll get this, one step at a time...

----------

## fyreflyer

I just want to say thanks flybynite!

I found setup to be a breeze; http-replicator makes me feel  like a better netizen.   :Cool: 

----------

## jopalm

Oops, my mistake...   You asked about the client and I listed the server.

Here we go with the client variables before an emerge, a failed emerge, and then an echo of the variables right after.

```
ballard root # source /etc/make.conf

ballard root # echo $FETCHCOMMAND

ballard root # echo $RESUMECOMMAND

ballard root # echo $http_proxy

http://magnolia:8080

ballard root # emerge --fetchonly eject

Calculating dependencies ...done!

>>> emerge (1 of 1) sys-apps/eject-2.0.13 to /

>>> Downloading http://distfiles.gentoo.org/distfiles/eject-2.0.13.tar.gz

--00:19:25--  http://distfiles.gentoo.org/distfiles/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Resolving magnolia... 192.168.0.5

Connecting to magnolia[192.168.0.5]:8080... connected.

Proxy request sent, awaiting response...

00:19:25 ERROR -1: No data received.

>>> Downloading http://distro.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/eject-2.0.13.tar.gz

--00:19:25--  http://distro.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Resolving magnolia... 192.168.0.5

Connecting to magnolia[192.168.0.5]:8080... failed: Connection refused.

>>> Downloading http://www.ibiblio.org/pub/Linux/utils/disk-management/eject-2.0.13.tar.gz

--00:19:25--  http://www.ibiblio.org/pub/Linux/utils/disk-management/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Resolving magnolia... 192.168.0.5

Connecting to magnolia[192.168.0.5]:8080... failed: Connection refused.

>>> Downloading http://www.pobox.com/~tranter/eject-2.0.13.tar.gz

--00:19:25--  http://www.pobox.com/%7Etranter/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Resolving magnolia... 192.168.0.5

Connecting to magnolia[192.168.0.5]:8080... failed: Connection refused.

!!! Couldn't download eject-2.0.13.tar.gz. Aborting.

!!! Fetch for /usr/portage/sys-apps/eject/eject-2.0.13.ebuild failed, continuing...

!!! Some fetch errors were encountered.  Please see above for details.

ballard root # source /etc/make.conf

ballard root # echo $FETCHCOMMAND

ballard root # echo $RESUMECOMMAND

ballard root # echo $http_proxy

http://magnolia:8080

ballard root #
```

I ran the same steps on two other clients (fremont and wedgwood) with exactly the same results, but didn't post them for space considerations.

I've got to admit - I'm baffled and scratching my head on this one, so I appreciate all of your suggestions and help!

Regards,

-John

----------

## jopalm

And here are the conf.d files from the working server fremont and the failed server magnolia:

```
fremont conf.d # cat  http-replicator

## Config file for http-replicator

## sourced by init scripts automatically

## GENERAL_OPTS used by repcacheman

## DAEMON_OPTS used by http-replicator

## Set the cache dir

GENERAL_OPTS="--dir /var/cache/http-replicator"

## Change UID/GID to user after opening the log and pid file.

## 'user' must have read/write access to cache dir:

GENERAL_OPTS="$GENERAL_OPTS --user portage"

## Don't change or comment this out:

DAEMON_OPTS="$GENERAL_OPTS"

## Do you need a proxy to reach the internet?

## This will forward requests to an external proxy server:

## Use one of the following, not both:

#DAEMON_OPTS="$DAEMON_OPTS --external somehost:1234"

#DAEMON_OPTS="$DAEMON_OPTS --external username:password@host:port"

## Local dir to serve clients.  Great for serving binary packages

## See PKDIR and PORTAGE_BINHOST settings in 'man make.conf'

## --alias /path/to/serve:location will make /path/to/serve

## browsable at http://http-replicator.com:port/location

DAEMON_OPTS="$DAEMON_OPTS --alias /usr/portage/packages/All:All"

## Dir to hold the log file:

DAEMON_OPTS="$DAEMON_OPTS --log /var/log/http-replicator.log"

## Make the log messages less and less verbose.

## Up to four times to make it extremely quiet.

#DAEMON_OPTS="$DAEMON_OPTS --quiet"

#DAEMON_OPTS="$DAEMON_OPTS --quiet"

## Make the log messages extra verbose for debugging.

#DAEMON_OPTS="$DAEMON_OPTS --debug"

## The ip addresses from which access is allowed. Can be used as many times

## as necessary. Access from localhost is allowed by default.

DAEMON_OPTS="$DAEMON_OPTS --ip 192.168.*.*"

DAEMON_OPTS="$DAEMON_OPTS --ip 10.*.*.*"

## The proxy port on which the server listens for http requests:

DAEMON_OPTS="$DAEMON_OPTS --port 8085"

fremont conf.d #
```

```
magnolia conf.d # cat http-replicator

## Config file for http-replicator

## sourced by init scripts automatically

## GENERAL_OPTS used by repcacheman

## DAEMON_OPTS used by http-replicator

## Set the cache dir

GENERAL_OPTS="--dir /var/cache/http-replicator"

## Change UID/GID to user after opening the log and pid file.

## 'user' must have read/write access to cache dir:

GENERAL_OPTS="$GENERAL_OPTS --user portage"

## Don't change or comment this out:

DAEMON_OPTS="$GENERAL_OPTS"

## Do you need a proxy to reach the internet?

## This will forward requests to an external proxy server:

## Use one of the following, not both:

#DAEMON_OPTS="$DAEMON_OPTS --external somehost:1234"

#DAEMON_OPTS="$DAEMON_OPTS --external username:password@host:port"

## Local dir to serve clients.  Great for serving binary packages

## See PKDIR and PORTAGE_BINHOST settings in 'man make.conf'

## --alias /path/to/serve:location will make /path/to/serve

## browsable at http://http-replicator.com:port/location

DAEMON_OPTS="$DAEMON_OPTS --alias /usr/portage/packages/All:All"

## Dir to hold the log file:

DAEMON_OPTS="$DAEMON_OPTS --log /var/log/http-replicator.log"

## Make the log messages less and less verbose.

## Up to four times to make it extremely quiet.

#DAEMON_OPTS="$DAEMON_OPTS --quiet"

#DAEMON_OPTS="$DAEMON_OPTS --quiet"

## Make the log messages extra verbose for debugging.

DAEMON_OPTS="$DAEMON_OPTS --debug"

## The ip addresses from which access is allowed. Can be used as many times

## as necessary. Access from localhost is allowed by default.

DAEMON_OPTS="$DAEMON_OPTS --ip 192.168.0.*"

## The proxy port on which the server listens for http requests:

DAEMON_OPTS="$DAEMON_OPTS --port 8080"

magnolia conf.d #
```

Note that magnolia listens on 8080 while fremont listens on 8085 due to a conflict.   (Fremont was only set up as a temporary server to help diagnose the current problem with magnolia...)

Regards,

-John

Addendum:

I ran a diff of the two to help spot differences:

```
ajopalm@ballard ajopalm $ diff fremont.conf.d magnolia.conf.d

38c38

< #DAEMON_OPTS="$DAEMON_OPTS --debug"

---

> DAEMON_OPTS="$DAEMON_OPTS --debug"

42,43c42

< DAEMON_OPTS="$DAEMON_OPTS --ip 192.168.*.*"

< DAEMON_OPTS="$DAEMON_OPTS --ip 10.*.*.*"

---

> DAEMON_OPTS="$DAEMON_OPTS --ip 192.168.0.*"

46c45

< DAEMON_OPTS="$DAEMON_OPTS --port 8085"

---

> DAEMON_OPTS="$DAEMON_OPTS --port 8080"
```

Addendum II:

A couple of the differences surprised me, so I edited failed magnolia to be identical to fremont by 

```
changing magnolia to:

DAEMON_OPTS="$DAEMON_OPTS --ip 192.168.*.*"

adding to magnolia:

DAEMON_OPTS="$DAEMON_OPTS --ip 10.*.*.*"

and changing magnolia to:

DAEMON_OPTS="$DAEMON_OPTS --port 8085"
```

This makes the debugging level the only difference between the two.

```
ajopalm@ballard ajopalm $ diff fremont.conf.d magnolia.conf.d

38c38

< #DAEMON_OPTS="$DAEMON_OPTS --debug"

---

> DAEMON_OPTS="$DAEMON_OPTS --debug"
```

My lack of confidence in any of the above three was rewarded when magnolia continued to fail!  :Wink: 

----------

## flybynite

Everything still looks normal.  The request from the client should be normal, and http-replicator shouldn't die even if it weren't.....

I looked again at the traceback.  It shows that the error occurs in Python proper, in asyncore.  The back and forth is just Python checking for data in the socket, finding data is ready, telling http-replicator, which says ok, give it to me, and then Python errors when reading the data from the socket.  It just shouldn't happen....
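That poll/ready/read cycle can be sketched outside asyncore with a plain socket pair (Python 3 here, using select directly; the request line is just an example, not what portage actually sends):

```python
import select
import socket

# A connected socket pair stands in for the client/proxy connection.
proxy_side, client_side = socket.socketpair()
client_side.sendall(b'GET /All HTTP/1.0\r\n\r\n')  # client sends its request

# 1. The event loop asks the OS which sockets have data waiting.
readable, _, _ = select.select([proxy_side], [], [], 1.0)

# 2. The socket reports ready, so the handler is asked to read...
assert proxy_side in readable

# 3. ...and the actual recv() is the step where the reported error was raised.
data = proxy_side.recv(4096)
print(data.decode())
```

The recv() should simply return the bytes already sitting in the socket buffer, which is why an error at that point looks like something below http-replicator.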

 *jopalm wrote:*   

> 
> 
> [code]Traceback (most recent call last):
> 
>   File "/usr/lib/python2.3/asyncore.py", line 69, in read
> ...

 

Try two things.  First try to read the http-replicator cache dir from a browser or wget from any client:

```

http://magnolia:8080/All

```

This should show the cache listing.

Then try to re-emerge python on the failing server.  Don't use a binary package (you may or may not have used one on this box previously); just compile it from source, and without distcc.

This box seems to have trouble with Python sockets.  Distcc could cause problems in certain circumstances.  This could also happen when using binary packages, for similar reasons.  The donor box has different settings which could have an effect, even though it's not supposed to happen.

At least that's my next theory.....

----------

## jopalm

Well, no joy....

When trying to browse to magnolia and view the cache, we trigger the same error as when a client emerges an ebuild.  Same messages in log and killed the http-replicator process.

Next, I re-emerged python (from source) on the broken magnolia host with no change in behavior during client emerges.

Next, I stopped distccd on magnolia and tried emerging again.  Same results.

At this point I have a proposal.  Unless you still have some ideas you want to follow up on (in which case I'll go at it for as long as you want...) I suggest we set this aside for the time being.

As I'm reorganizing the distribution of services and applications on my network, it's looking more and more like it will be desirable to completely rebuild magnolia.  Since we're both leaning to a lower-level cause of the emerge failures (that is, a systemic problem) I think it might make sense to see if this problem persists after the system rebuild.  With any luck, I should be able to swap in a new disk and expedite this for this weekend.

I really do appreciate all of your help and suggestions.  If you want to chase this a little further, let me know; otherwise I'll follow the above plan and let you know the results...

Regards,

-John

----------

## flybynite

 *jopalm wrote:*   

> Well, no joy....
> 
>  I suggest we set this aside for the time being.
> 
> 

 

Ok, sorry we couldn't figure it out.  If it isn't Python, it's glibc that defines the socket  :Sad: 

It's good to know that http-replicator does work well on your other boxes  :Smile: 

----------

## jopalm

Well I have good news for a change...

I was able to swap in a new hard drive and get the basic system reinstalled on magnolia (the recalcitrant server) this evening.  One of the first services I activated was http-replicator so that I could build up the new system from my existing cache.

It's working like a charm.  I've synced my other boxes and they're all drawing against the cache on magnolia with no problem.  I'm also able to fire up a browser and view the cache now.

Only hitch *might* be that I'm currently updating world on magnolia.  We'll see if it survives that and still functions after gcc, python, etc are all updated!   :Wink: 

Thanks again for your help!

Regards,

-John

----------

## flybynite

 *jopalm wrote:*   

> Well I have good news for a change...
> 
> I was able to swap in a new hard drive and get the basic system reinstalled on magnolia (the recalcitrant server) this evening.  One of the first services I activated was http-replicator so that I could build up the new system from my existing cache.
> 
> It's working like a charm. 

 

I knew it would work fine  :Wink: 

----------

## zsoltika

Hi everyone,

Just found this topic, and I'm really interested in it, but got one question. 

The story is: some of my workmates use Gentoo (I'm one of them now, having come over from the Red Hat thing, and I'm pretty proud of it).  Our sysadmin and firm use an internet quota, which means every single user has a limited download allowance.

So if we set this replicator thing the question is:

if the package.(tar.(gz|bz2)|tgz|rpm|.*){1} isn't found in any of my mates' caches, nor in mine, which machine would download it from the outside repositories?  How do I configure things so that only the local machine downloads from the outernet?

Sorry if this was a dumb one...

----------

## flybynite

 *zsoltika wrote:*   

>  How to configure to download from the outernet only from the local machine?
> 
> 

 

You might be able to do what you want, but http-replicator was not designed to work that way.

If you really want to know how I would try it, PM me.

----------

## flybynite

 *zsoltika wrote:*   

> Our sysadm and firm uses internet quota which means every single user have a limited amount of download capabilities.
> 
> So if we set this replicator thing the question is:
> 
> if the package.(tar.(gz|bz2)|tgz|rpm|.*){1} wouldn't be found on any of the mate's cache neither mine -
> ...

 

NOTE:  THIS IS A SPECIAL SETUP AND WON'T WORK FOR NORMAL USE!!!! 

NOTE:  THIS IS A SPECIAL SETUP AND WON'T WORK FOR NORMAL USE!!!! 

NOTE:  THIS IS A SPECIAL SETUP AND WON'T WORK FOR NORMAL USE!!!! 

At first your request was so special I didn't think anyone else would benefit from it.  Then I thought it would show the power and flexibility of http-replicator!!  

The best solution would be to have your firm install http-replicator with its own download limit.  You and your friends could donate a portion of your limits to it if necessary.

The next best thing is this....

What you want is for each of the gentoo users to install http-replicator on their box.

In each user's /etc/conf.d/http-replicator, change this line to add a second --alias like so: 

```

DAEMON_OPTS="$DAEMON_OPTS --alias /usr/portage/packages/All:All --alias /var/cache/http-replicator:distfiles"

```

In each user's /etc/make.conf, set http_proxy to their own http-replicator.

In each user's /etc/make.conf, list the other users' boxes in GENTOO_MIRRORS like so:

```

GENTOO_MIRRORS="http://otherbox:8080 http://otherbox2:8080 http://rest of your mirrors"

```

Then it will work as you requested!!  Downloads will come from the cache first, your friends cache next, then your box will download the file from the internet.  All files you've downloaded will be available to your friends.

What's the trick, you ask?  Http-replicator can serve cache content as a normal http server.  The alias option allows requests for different dirs to be served from different dirs on disk.

Setup like this, http-replicator acts like a caching proxy for localhost and as a general http-server for your friends.

Sorry, I haven't tested this, but I believe it should work....  I know of one problem: if a friend sets your box as http_proxy, his downloads will come from your box.  I can't think of a way to prevent this right now.  You will of course have logs to prove that he is abusing your download limit.
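To make the recipe concrete, each user's /etc/make.conf would then contain something like this (the hostnames are placeholders for your own boxes, and the port assumes the default 8080 from this howto):

```shell
# /etc/make.conf fragment -- hypothetical hostnames, adjust for your LAN
http_proxy="http://localhost:8080"    # each box goes through its own http-replicator
GENTOO_MIRRORS="http://otherbox:8080 http://otherbox2:8080 http://your.usual.mirror"
```

Because GENTOO_MIRRORS is tried in order, portage asks the friends' replicators (acting as plain http servers via the distfiles alias) before falling back to a real mirror.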

----------

## zsoltika

First, thanks for your answer.  I think we will implement it from Monday, and I will write about the tests.  Thanks again; I really love this community.

 *flybynite wrote:*   

> 
> 
> The best solution would be to have your firm install http-replicator with it's own download limit.  You and your friends could donate a portion of your limit to it if necessary.
> 
> 

 

<ot>

This would be really great.  I asked him many times, but he is a Red Hat guy.  Anyway, he's a really good kind of man, apart from not switching from Red Hat/Fedora.

</ot>

Zsoltika

----------

## zsoltika

It seems to us that it's working, so thanks again flybynite!

Got only one suggestion: it would be great if http-replicator sorted the accessible files alphabetically when the cache is browsed.

As I'm not good at Python, nor at OOP, I just found that this is the code to patch, but I don't know what or how:

```
for tail in os.listdir(head or os.curdir): # iterate over directory contents

  if tail.startswith('.'): # don't list hidden files

    continue

  path = os.path.join(head, tail) # create path

  if os.path.isdir(path): # path is a directory; append slash and skip size

    print >> self.counterpart.data, '<a href="%s/%s/">%-63s -' % (self.direct, tail, tail[:50]+'/</a>'),

  else: # path is a file

    print >> self.counterpart.data, '<a href="%s/%s">%-54s %10i' % (self.direct, tail, tail[:50]+'</a>', os.path.getsize(path)),

    print >> self.counterpart.data, time.ctime(os.path.getmtime(path)) 
```

But it's only a feature request...

----------

## zsoltika

Ok I figured it out, but as I mentioned I'm not a python expert, so possibly this isn't the best solution...

```

zsoltika = os.listdir(head or os.curdir)

zsoltika.sort()

for tail in zsoltika: # iterate over directory contents

  if tail.startswith('.'): # don't list hidden files

    continue

  path = os.path.join(head, tail) # create path

  if os.path.isdir(path): # path is a directory; append slash and skip size

    print >> self.counterpart.data, '<a href="%s/%s/">%-63s -' % (self.direct, tail, tail[:50]+'/</a>'),

  else: # path is a file

    print >> self.counterpart.data, '<a href="%s/%s">%-54s %10i' % (self.direct, tail, tail[:50]+'</a>', os.path.getsize(path)),

    print >> self.counterpart.data, time.ctime(os.path.getmtime(path)) 
```

After this edit, the page served over http will be sorted in [0-9A-Za-z] order.
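For anyone who wants to try the same idea standalone on a current Python, here is a small sketch (Python 3; sorted() replaces the separate sort() call, and the file names are made up for the demo):

```python
import os
import tempfile

def sorted_listing(head):
    """Return the non-hidden entries of a directory in [0-9A-Za-z] order,
    the way the patched listing code would emit them."""
    entries = sorted(os.listdir(head or os.curdir))  # sort before iterating
    return [tail for tail in entries if not tail.startswith('.')]

# demonstrate against a throwaway directory with some fake distfiles
demo = tempfile.mkdtemp()
for name in ('zlib-1.2.1.tar.gz', 'Python-2.3.5.tar.bz2', '.hidden', '9base.tar'):
    open(os.path.join(demo, name), 'w').close()
print(sorted_listing(demo))
```

Note that this is plain ASCII ordering, so digits sort before uppercase, which sorts before lowercase.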

----------

## flybynite

That should work alright...  Thanks for the suggestion!

The file listing feature is really just a byproduct of making http-replicator compatible as a "PORTAGE_BINHOST", thus making http-replicator able to serve both as a distfile mirror and as a binary package mirror.  

Although most users probably use it with just a couple of boxes, my goal is to allow hundreds or thousands of users where I must keep the memory and cpu usage low.

You're probably the first user to actually look at the output; portage doesn't care what order the files are in!!

The file listing feature will probably be optional in the next version, maybe sorting will also be an option  :Smile: 

----------

## zsoltika

 *flybynite wrote:*   

> 
> 
> You're probably the first user to actually look at the output; portage doesn't care what order the files are in!!
> 
> The file listing feature will probably be optional in the next version, maybe sorting will also be an option 

 

Just did that because the other (non-Gentoo, so I'm looking forward to converting them all; Gentoo to rule the world  :Very Happy:  ) linux users are always downloading sources (especially the kernel ones), and as I told them to download from me, they asked for the list to be sorted.  And that "code" is under the same license as yours  :Very Happy: 

Zsoltika

----------

## stodas

This is a fantastic solution to installing and maintaining Gentoo on multiple machines  :Smile: 

Sorry if this has been covered before. There are so many posts I probably missed it  :Embarassed: 

I only have one minor problem.  While emerging a large file on a client machine, my internet connection dropped out halfway through the download.  After re-establishing the connection and starting emerge again, the download just hung while trying to resume.  I noticed that http-replicator on the server didn't cache the partly downloaded file.  Deleting the partial download on the client allowed things to get moving again.  The problem is that my connection is slow and unreliable, which makes this situation almost unworkable for large files.  Is there any way of getting http-replicator to resume the download of a partially downloaded file?

Thanks for your time  :Smile: 

----------

## flybynite

That is in the works, but complicated.  Until then, change your mirror.  Slow, choked mirrors can cause problems even on a fast connection!
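If you suspect your mirror, app-portage/mirrorselect can benchmark mirrors and pick faster ones for you; roughly along these lines (the exact flags are an assumption here, so check `mirrorselect --help` before running):

```shell
# sketch: let mirrorselect test mirrors and append the fastest to make.conf
emerge mirrorselect
mirrorselect -s3 -o >> /etc/make.conf
```
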

----------

## xuttuh

I just want to commend you on this ebuild; this is just a wonderful program that I couldn't live without for all my boxen.  I have just run into one problem: I updated to sys-apps/baselayout-1.11.9-r1 and http-replicator stopped working with this error.

```

>>> Downloading http://gentoo.osuosl.org/distfiles/eject-2.0.13.tar.gz

--01:24:48--  http://gentoo.osuosl.org/distfiles/eject-2.0.13.tar.gz

           => `/usr/portage/distfiles/eject-2.0.13.tar.gz'

Connecting to 192.168.1.1:8080... failed: Invalid argument.

Retrying.

```

Here are some of the relevant packages I have installed.

```

[ebuild   R   ] sys-apps/portage-2.0.51.19  -build -debug (-selinux) 277 kB

[ebuild   R   ] sys-apps/baselayout-1.11.9-r1  -bootstrap -build -debug -livecd -static (-uclibc) 158 kB

[ebuild   R   ] net-misc/http-replicator-3.0  0 kB

```

Here is my /etc/conf.d/http-replicator

```

GENERAL_OPTS="--dir /var/cache/http-replicator"

GENERAL_OPTS="$GENERAL_OPTS --user portage"

DAEMON_OPTS="$GENERAL_OPTS"

DAEMON_OPTS="$DAEMON_OPTS --alias /usr/portage/packages/All:All"

DAEMON_OPTS="$DAEMON_OPTS --log /var/log/http-replicator.log"

DAEMON_OPTS="$DAEMON_OPTS --debug"

DAEMON_OPTS="$DAEMON_OPTS --ip 192.168.1.*"

DAEMON_OPTS="$DAEMON_OPTS --port 8080"

```

I would post http-replicator's log, but there is nothing in there except it starting and stopping since I emerged sys-apps/baselayout-1.11.9-r1.

I would downgrade the baselayout to the "stable" one in portage but I am using the new wireless networking features.

Any help in this would be great.

Timothy

----------

## flybynite

 *xuttuh wrote:*   

> I just want to commend you on this ebuild; this is just a wonderful program that I couldn't live without for all my boxen.  I have just run into one problem: I updated to sys-apps/baselayout-1.11.9-r1 and http-replicator stopped working with this error.
> 
> 

 

Thanks!

 *xuttuh wrote:*   

> 
> 
> ```
> 
> >>> Downloading http://gentoo.osuosl.org/distfiles/eject-2.0.13.tar.gz
> ...

 

Although it's possible some change did break something, my first guess is that it's not the problem.

```

root # etcat -v baselayout

[ Results for search key           : baselayout ]

[ Candidate applications found : 8 ]

 Only printing found installed programs.

*  sys-apps/baselayout :

        [  I] 1.9.4-r6 (0)

        [M  ] 1.9.4-r7 (0)

        [M  ] 1.11.8-r3 (0)

        [M~ ] 1.11.9-r1 (0)

        [M~ ] 1.11.10 (0)

        [M~ ] 1.11.10-r1 (0)

        [M~ ] 1.11.10-r2 (0)

        [M~ ] 1.12.0_alpha1-r2 (0)

```

With all these versions of baselayout it's a little hard to test them all  :Smile:   And it's possible you updated something else besides baselayout?

Either way we can fix this if you are willing to help!!

1.  Did you update the server box, the clients, or both?

2.  Reboot both the client and server boxes if you haven't.  Linux usually needs only a logout and login, but some changes like the kernel and baselayout run so deep that a reboot is the best test.

3.  What happens when you browse the cache with firefox?

http://192.168.1.1:8080/

http://192.168.1.1:8080/All

----------

## xuttuh

After sitting down and thinking about what I changed in the past 3 days, I realized that I removed net.lo from startup.   :Embarassed:    Once I started it back up, everything worked fine again.  

Thanks flybynite for prodding me to take the time to think of the unobvious (or my stupidity, whatever we want to call it  :Very Happy: ).

EDIT: really bad grammar.

Timothy

----------

## al

Can I just say thanks for this HowTo.

I have been using Lan-Http-Replicator for probably 6 months now on my Gentoo server, and I have 4 other Gentoo boxes requesting files from it, and everything works flawlessly.

It's great to emerge something and see it arriving at 7MB/s instead of 56KB/s.

I also have all my boxes requesting web pages through Squid, so if you guys noticed the internet has been a bit quicker lately, it's down to me.

 :Very Happy:   :Very Happy: 

----------

## hielvc

Another domo arigato  :Exclamation:  and if you don't understand my version of Japanese: Thanks.

----------

## obsidianblackhawk

First of all flybynite, awesome program, thanks.  My question is this.

Is there any way to make it so that after installing a program on the client box, the client's distfiles directory is cleaned out as well?  You know what I mean?  

Emerge a program, http-replicator gets the file, my client gets the file, but I only want the file permanently stored on the server running http-replicator.  I want it removed from the client after it is installed.

Is there any way to do this automatically?

----------

## zsoltika

 *obsidianblackhawk wrote:*   

> Emerge a program, http-replicator gets the file, my client gets the file, but I only want the file permanently stored on the server running http-replicator.  I want it removed from the client after it is installed.
> 
> Is there any way to do this automatically?

 

Not an elegant solution but why not do this:

```
emerge -<flags> <package|world|system> && rm /usr/portage/distfiles/*.*
```

But a much simpler way (as far as I can see) would be to mount the server's /var/cache/http-replicator directory via samba or nfs.

So the server will do an 

```
emerge -DNuf world && repcacheman
```

then all the files from the server's distfiles dir will be moved to the cache and will be available to clients.  

Just ignore it if it's dumb.

----------

## BolO

Hi there, am busy setting up your system but when I try to start the service I get 

 *Quote:*   

> * Caching service dependencies ...
> 
>  *  Service 'dnsmasq' already provided by 'dns'!;
> 
>  *  Not adding service 'named'...                                                                                                                       [ ok ]
> ...

 

Can anyone help me on this?

----------

## blotto

Hi 

Just set up lan http-replicator and rsync mirror which works OK - except for one small prob.

When doing an emerge from another machine the download from the cache stops before completing

On 2 occasions it stalled at 93% and the corresponding log entry for http-replicator was:-

 *Quote:*   

> 20 Apr 2005 23:42:27 INFO: HttpClient 50 proxy request for http://ftp.gentoo.skynet.be/pub/gentoo/distfiles/dialog_1.0-20050206.orig.tar.gz
> 
> 20 Apr 2005 23:42:27 INFO: HttpServer 50 serving file from cache
> 
> 20 Apr 2005 23:42:27 STAT: HttpClient 50 received 299742 bytes

 

It does not seem to be size related since the time before this it was qt which is 14M+

Anyone else seen this ?

----------

## flybynite

 *blotto wrote:*   

> Hi 
> 
> On 2 occasions it stalled at 93% and the corresponding log entry for http-replicator was:-
> 
>  *Quote:*   
> ...

 

This is caused by a corrupt file in replicator's cache.  It's not supposed to happen, but it does.  I can't duplicate the problem, so it must be just a few broken http servers or a bizarre network error that causes this.

The fix will be to have repcacheman check the cache and the distfile dir.  This is due in the next version.

In the meantime you can force repcacheman to check the whole cache, or just delete the offending files from the cache.

If your cache and the distfiles dir are on the same partition this is quick; if they're on different partitions it just takes longer.

```

mv /var/cache/http-replicator/* /usr/portage/distfiles/

repcacheman

rm /usr/portage/distfiles/*

```

Last edited by flybynite on Sun Apr 24, 2005 4:33 pm; edited 1 time in total

----------

## flybynite

 *BolO wrote:*   

> Hi there, am busy setting up your system but when I try to start the service I get 
> 
>  *Quote:*   
> 
> http-replicator: error: port 8080 is not available
> ...

 

You have another program using port 8080.  It could be a second copy of http-replicator or another program.

Run this to show port usage:

```

netstat -pletu

```

----------

## blotto

 *flybynite wrote:*   

> 
> 
> This is caused by a corrupt file in replicator's cache.  It's not supposed to happen, but it does.  I can't duplicate the problem, so it must be just a few broken http servers or a bizarre network error that causes this.
> 
> The fix will be to have repcacheman check the cache and the distfile dir.  This is due in the next version.
> ...

 

Hi again.  I tried your recommendations, but repcacheman said "no dupes and none corrupt", and it still does it!

----------

## killercow

Hi,

I'm wondering if this program could be used to do the following:

Create a single point for portage to store all of the synced files, probably mounted through NFS/Samba, or queried by a special portage clone.

Create a single server/machine that would handle compiling programs (possibly helped by other machines through distcc and ccache).

Create a way to handle different archs and make options on one server.

This would be the sweetest option available for things like clusters/schools/internet cafes.

Since most of these situations can't have:

their nodes compile things for themselves,

And have their nodes configured in a couple of different ways.

And usually have a couple of nodes per config/arch set.

And only need one portage cache.

This would imply that any machine on the lan could just do an emerge or emerge world -up, and it would query the shared portage tree for its packages depending on its own config set.

It would then ask the http_proxy thingy for the binary packages for its arch and make config.

The proxy would then either hand over that package so it can be installed, or it could start compiling it (maybe with the help of other machines, including the requestor, through distcc).

Since this would also imply that some packages will be built with different use/make flags but would not actually differ (e.g. the make flags don't apply), it would be handled most simply by using a large (2GB+?) ccache.

Any other thoughts on this?

I would use it for my cluster consisting of 5 servers all configured the same, 1 other server which only has two different make flags,

And one completely different server (different arch and setup, because it's the fileserver).

This would help my cluster get managed the way a setup like that should be (just call the emerge world -up command on every machine, and I'm set).

And if I need a special package on any of the nodes, it could just emerge it, and it would get recorded in that node's world file.

----------

## bigmacx

I'm using the http-replicator for distfile sharing among about 5 pc's. This thread has been very helpful, but I have one question about the usage of http-replicator.

In using the http-replicator service, the source tarballs are still downloaded to the local client, just from a LAN source, not necessarily the Internet. So there would appear to be a concern about the local client disk filling up. I imagine an easy fix would be to run an "rm -rf /usr/portage/distfiles/*" after an "emerge --update world".

Q1: Is there any emerge or portage command to automatically delete the sources after emerging?

In the HOWTO, there is mention of occasionally running the "repcacheman" command. From what I can tell, this recommendation concerns only the http-replicator server, not the clients. And from what I understand, "repcacheman" moves all valid packages from distfiles to the http-replicator cache.

Q2: Can the http-replicator server use itself as a proxy for the distfiles and eliminate the need to run "repcacheman" repeatedly beyond the initial install?

Previously, I used nfs/samba for sharing the distfiles among these 5 pc's and did not have a local space concern for them. The http-replicator, combined with the local rsyncd server described elsewhere, makes for a GREAT automation system for locally administering Gentoo linux PC's!!!!!!

Now if I could just find something to help with the big-picture administration workflow of testing the updates on an exact clone of the current system BEFORE adding the updates to the production install. I get a feeling that automatically rsync-ing and "emerge --update world" creates production systems that are just a tad bit more reliable than auto-WindowsUpdate. :Embarassed: 

----------

## zsoltika

 *bigmacx wrote:*   

> Q1: Is there any emerge or portage command to automatically delete the sources after emerging?

 

You can set up a really "complex" shell script like this:

```
#!/bin/bash

/usr/bin/emerge "$@" && rm /usr/portage/distfiles/*.*
```

Then save it somewhere, for example as Emerge.sh, and set up an alias for it: 

```
alias emerge='/whereyousavedit/Emerge.sh'
```

AFAIK there is a feature request in bugzilla for emerge to remove the sources (from the distfiles dir) after a successful install.

 *bigmacx wrote:*   

> Q2: Can the http-replicator server use itself as a proxy for the distfiles and eliminate the need to run "repcacheman" repeatedly beyond the initial install?

 

Sorry, it must be my English, but I don't understand what you meant.

----------

## bigmacx

 *zsoltika wrote:*   

>  *bigmacx wrote:*   Q2: Can the http-replicator server use itself as a proxy for the distfiles and eliminate the need to run "repcacheman" repeatedly beyond the initial install? 
> 
> Sorry, it must be my English, but I don't understand what you meant.

 Thanks for the script recommendation.

For Q2, I understand that when http-replicator gets a request from a client, it transfers the source file to the client's local distfiles directory from the HR cache directory. And if the file is not in the cache, it downloads the file first. 

Now, for the http-replicator server itself: if the HR server is not set up to use its own proxy (the http_proxy line in make.conf), then the HR server will download the source files directly from the Internet to the HR server's distfiles directory.

The way I read the HOWTO, the above statement is the entire reason for running the "repcacheman" command at all beyond the initial install.

So my thought was to configure the HR server to use the HR proxy. That way, emerges on the HR server would populate the HR cache the same way the clients do, and I could just delete the HR server's distfiles/* the same way I would on the clients.

----------

## pHeel

Ok, just wanted to ask if it's my box or if there is an issue with the emerge of this. 

This is what the attempt at emerge yields: 

 *Quote:*   

> 
> 
> emerge http-replicator
> 
> Calculating dependencies ...done!
> ...

 

Multiple attempts same result. Anyone have any ideas?

----------

## assaf

If you're sure you have the latest ebuild you can do the following as root:

```

ebuild [full_path_to_ebuild_file] digest

```

----------

## pHeel

Thanks, that seems to have worked.  Will post either way once I know.  Still learning all the odds and ends of this distro; that one was a totally new one to me.

Thanks again  :Very Happy: 

----------

## zsoltika

 *bigmacx wrote:*   

> For Q2, I understand that when http-replicator gets a request from a client, it transfers the source file to the client's local distfiles directory from the HR cache directory. And if the file is not in the cache, it downloads the file first. 
> 
> Now, for the http-replicator server itself: if the HR server is not set up to use its own proxy (the http_proxy line in make.conf), then the HR server will download the source files directly from the Internet to the HR server's distfiles directory.
> 
> So my thought was to configure the HR server to use the HR proxy. That way, emerges on the HR server would populate the HR cache the same way the clients do, and I could just delete the HR server's distfiles/* the same way I would on the clients.

 

As far as I understand this the two solutions you have (on the server):

1) set up another script 

(this way you still have to run repcacheman)

```
#!/bin/bash

/usr/bin/emerge "$@" && /usr/bin/repcacheman
```

2) set $DISTDIR in make.conf

This should work, but it would be good if your clients don't download while the server does; to that end, set $DISTDIR (in /etc/make.conf) to the same dir as HR's cache.

HTH,

Zsoltika

----------

## killercow

Did anyone read my post?  Or did I ask too much black magic and voodoo from you guys?

I'd really love to have the features I described in my previous post, and since I don't think portage will be changed to include them anytime soon, I'd better see what other projects might support my ideas.

The idea of install-fests, also noted in this thread, would also be solved by this.

----------

## bigmacx

 *zsoltika wrote:*   

> 
> 
> 1) set up another script 
> 
> (this way you still have to run repcacheman)
> ...

 The #1 I understand.  The #2 causes sharing problems similar to the other, more basic, distfiles sharing by nfs/samba.

I think maybe I'm confused or simply not being clear enough. Thanks for your continued help zsoltika!

Does anyone know if it's safe to just set up the http-replicator server to use the http-replicator proxy itself, by configuring make.conf on the http-replicator server in the same manner as a client?

----------

## zsoltika

 *bigmacx wrote:*   

>  *zsoltika wrote:*   
> 
> 2. set $DISTDIR in make.conf
> 
> this should work but it would be good if your clients don't download while the server downloads, so set $DISTDIR (in /etc/make.conf) to the same dir as HR's cache
> ...

 

OK I'm reading your blue question, silly me for misunderstanding.

As the great flybynite explained in a previous post, it could work (of course), so here is a big thanks again for flybynite's great work.

At our workplace three of us are using http-replicator as explained there; each of us is a server for the two others and for our own machine. Your case is much simpler as I see it: set up the clients to use the server, then set http_proxy to localhost:<port>.

Then the server will use its own proxy. I don't know exactly how it handles packages missing from the cache (it will simply download them to ..distfiles/ and/or hold a copy in the cache), but it should be done in the (http-replicator) script IMHO; correct me if I'm wrong.

We (at my workplace) use the same method: from cron, sync the tree, then 'emerge -uDfN world', then run repcacheman (all one script), and it just works (in the night...)

----------

## thecooptoo

```
Connecting to 192.168.0.10:12000... failed: Connection refused.

!!! Couldn't download itcl3.2.1_src.tgz. Aborting.

```

```
bash-2.05b# cat /etc/make.conf |grep 12000

http_proxy=http://192.168.0.10:12000

bash-2.05b#               
```

I've opened a port on the firewall 

```

router root # /etc/init.d/http-replicator status

 * status:  started

router root # cat /etc/shorewall/rules |grep 12000

ACCEPT loc              fw              tcp     12000

```

Where is it being refused?

EDIT

this might get confusing - I've started this as a new thread in Portage & Programming

----------

## assaf

What port and IP did you set in the conf.d file?

Does it work when the firewall is down?

----------

## Master One

Http-replicator is working great, but I see one unsolved issue:

If I successfully emerge a package on the server, followed by repcacheman, the DISTDIR is empty again of course, because all files downloaded by the http-replicator server get moved to the cache-dir, as intended.

If I now for some reason have to re-emerge any of the already installed packages on the server, it will download the file again, because it only looks in DISTDIR to see if the file is already there, not in the cache-dir.

Is there any solution for this problem?

I think this is the major downside of having a DISTDIR and a separate cache-dir.

Is there still no solution for having DISTDIR=cache-dir? It would make things so much easier, and would also make it easy to integrate deltup.

----------

## zsoltika

 *Master One wrote:*   

> If I now for some reason have to reemerge any of the already installed packages on the server, it will download the file again, because it only looks in DISTDIR, if the file is already there, and not in the cache-dir.
> 
> Is there any solution for this problem?

 

Check flybynite's post about it ...

----------

## Master One

I could not find anything related following the link you provided, zsoltika.

But I already found the solution for the mentioned issue: localhost has to be defined as http_proxy in make.conf on the server machine; then portage will download the desired file from http-replicator's cache-dir into DISTDIR.  :Wink: 
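A minimal sketch of that in the server's own /etc/make.conf (the port here is a placeholder - use whatever port you configured in /etc/conf.d/http-replicator):

```

# the server talks to its own http-replicator, so re-emerges are
# served from the cache instead of hitting the internet again
http_proxy="http://127.0.0.1:8080"

```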

----------

## zsoltika

 *Master One wrote:*   

> I could not find any related into following the link you provided, zsoltika.

 

In that post flybynite points to: *Quote:*   

> 
> 
> In each users /etc/make.conf set http_proxy="their own http-replicator"... Then it will work as you requested!! Downloads will come from the cache first, your friends cache next, then your box will download the file from the internet. All files you've downloaded will be available to your friends. 
> 
> Whats the trick you ask? Http-replicator can serve cache content as a normal http server. The alias option allows requests for different dirs to be served from different dirs on disk. 

 

Cheers,

Zsoltika

----------

## assaf

I'm also using http-replicator in the symmetric configuration suggested on page 12 (two machines on a LAN that try to download from each other before going to the net). It works fine, and I think it should be mentioned in the first post as it is a very common situation. Not everyone has a gateway server which is always on.

Enhancement request:

I want to be able to configure the proxy to only look in the other server during specific time ranges in a day (i.e. I don't want to download stuff from my office computer to my home computer after working hours, because that may be considered abuse; also, on weekends the computer is off and I'm needlessly waiting for the connection to time out before it goes to the next mirror).

----------

## Master One

 *zsoltika wrote:*   

>  *Master One wrote:*   I could not find any related into following the link you provided, zsoltika. 
> 
> In that post flybynite points to: *Quote:*   
> 
> In each users /etc/make.conf set http_proxy="their own http-replicator"... Then it will work as you requested!! Downloads will come from the cache first, your friends cache next, then your box will download the file from the internet. All files you've downloaded will be available to your friends. 
> ...

 

Oops, sorry, zsoltika, you were right of course; I seem to be a little distracted today...

----------

## assaf

 *assaf wrote:*   

> 
> 
> Enhancement request:
> 
> I want to be able to configure the proxy to only look in the other server during specific time ranges in a day (i.e. I don't want to download stuff from my office computer to my home computer after working hours because that may be considered abuse, or perhaps on weekends the computer is off and i'm needlessly waiting for the connection to time out before it goes to the next mirror)

 

I've added --timeout=15 to the FETCH_COMMAND to prevent waiting on a connection to the peer when it's not there, but it would still be nice if there were a way to configure this for the proxy.
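For anyone wanting the same tweak, roughly what that looks like in /etc/make.conf - the wget options besides --timeout are quoted from memory of portage's stock fetch command of that era, so check them against your own /etc/make.globals before copying:

```

# give up on an unreachable peer after 15 seconds instead of hanging
# (RESUMECOMMAND can take the same flag)
FETCH_COMMAND="/usr/bin/wget -t 2 --timeout=15 --passive-ftp -O \${DISTDIR}/\${FILE} \${URI}"

```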

----------

## flybynite

Sorry it took so long to get back to you, I travel frequently and don't always have good net access.   I'll try to answer the questions still remaining.....

 *bigmacx wrote:*   

> 
> 
> Does anyone know if its safe to just setup the http-replicator server to use the http-replicator proxy itself by configuring the make.conf on the http-replicator server in the same manner that a client is configured?

 

Yes, read the howto again....

 *Quote:*   

> 
> 
> 2. Modify /etc/make.conf on both the server and your other gentoo boxes. 
> 
> 

 

Portage on the http-replicator server doesn't even know http-replicator is on the same box.  Portage on the client and server work exactly the same and are set up exactly the same!

----------

## flybynite

 *thecooptoo wrote:*   

> 
> 
> ```
> Connecting to 192.168.0.10:12000... failed: Connection refused.
> 
> ...

 

What does /var/log/http-replicator.log say?  Are you sure http-replicator is running?

----------

## flybynite

 *Master One wrote:*   

> 
> 
> I think this is the major downside, to have a DISTDIR and a separate cache-dir.
> 
> Is there still no solution, for having DISTDIR=cache-dir? This would make it so much easier, also to integrate deltup the easy way.

 

This comes up from time to time...

I don't see having a separate cache dir as a problem, but as a feature  :Smile: 

1.  http-replicator can run on "other" distros and doesn't need Gentoo.  This matters because in many situations at work, school, etc. you don't own the LAN and may face resistance to converting everything to Gentoo  :Smile:   Http-replicator actually started on and runs just fine on Debian, but I don't like to mention that.......(Sorry Gertjan..)

2.  Portage thinks it owns the DISTDIR and I can't change portage.  Since portage isn't designed to share, it is a really bad neighbor and leaves corrupt, incomplete and just plain garbage in the DISTDIR.  Imagine  http-replicator downloading and saving file.tar.gz while portage is downloading and saving file.tar.gz in the same dir????  

3.  This situation only exists on a gentoo box running http-replicator.  My support script repcacheman handles this exact situation and makes it a non event.  After repcacheman runs, there are no dups, no wasted space and no problems  :Smile: 

I don't know what the problem is with deltup you mentioned. 

Http-replicator has an "alias" feature that serves any dir as a standard http server.  This is how it serves binary packages as a PORTAGE_BINHOST and can serve anything else you want from any dir you want.

Last edited by flybynite on Mon May 09, 2005 4:08 am; edited 1 time in total

----------

## flybynite

 *assaf wrote:*   

> I'm also using http-replicator in a symmetric configurations suggested on page 12 (two machines on a LAN that try to download from each other before going to the net). It works fine and I think it should be mentioned in the first post as it is a very common situation. Not everyone has a gateway server which is always on.
> 
> 

 

I didn't realize that might be a common situation.  I might add a note to the howto..

 *assaf wrote:*   

> Enhancement request:
> 
> I want to be able to configure the proxy to only look in the other server during specific time ranges in a day (i.e. I don't want to download stuff from my office computer to my home computer after working hours because that may be considered abuse, or perhaps on weekends the computer is off and i'm needlessly waiting for the connection to time out before it goes to the next mirror)

 

I'm not sure how many others would need that feature.  Actually I can't imagine why you would ever want to use your work computer as a cache when you're at home.  I'm assuming that your internet connection to your work computer isn't any faster than your internet connection to the other mirrors.  I guess you would be saving the official mirrors' bandwidth....

Last edited by flybynite on Mon May 09, 2005 4:12 am; edited 1 time in total

----------

## Gherald

 *assaf wrote:*   

> Enhancement request:
> 
> I want to be able to configure the proxy to only look in the other server during specific time ranges in a day (i.e. I don't want to download stuff from my office computer to my home computer after working hours because that may be considered abuse, or perhaps on weekends the computer is off and i'm needlessly waiting for the connection to time out before it goes to the next mirror)

 

Simply use a bash script or alias that exports alternate environment variables, and use it to wrap emerge.
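A sketch of such a wrapper - the peer address, port, and hours below are all placeholders, not anything from this thread:

```

#!/bin/sh
# Hypothetical emerge wrapper: go through the peer's http-replicator only
# during working hours; otherwise download directly.  PEER is a placeholder.
PEER="http://192.168.0.10:8080"

pick_proxy() {
    # $1 = hour of day (00-23); prints the proxy to use, or nothing
    h=${1#0}    # strip one leading zero so "08" compares as decimal 8
    if [ "$h" -ge 9 ] && [ "$h" -lt 18 ]; then
        echo "$PEER"
    fi
}

http_proxy=$(pick_proxy "$(date +%H)")
if [ -n "$http_proxy" ]; then
    export http_proxy
else
    unset http_proxy
fi
# exec emerge "$@"    # enable for real use; commented out in this sketch

```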

----------

## assaf

 *Gherald wrote:*   

>  *assaf wrote:*   Enhancement request:
> 
> I want to be able to configure the proxy to only look in the other server during specific time ranges in a day (i.e. I don't want to download stuff from my office computer to my home computer after working hours because that may be considered abuse, or perhaps on weekends the computer is off and i'm needlessly waiting for the connection to time out before it goes to the next mirror) 
> 
> Simply use a bash script or alias that exports alternate environment variables, and use it to wrap emerge.

 

That's a pretty good idea. Thanks.

----------

## assaf

Wouldn't it make sense for http-replicator to stop downloading a file if there are no clients requesting it? I.e. if I started emerging something and aborted, the file will continue to download (what should I do then if I really don't want the file - restart http-replicator?). It can always resume downloading when someone requests the file again.

----------

## Master One

 *flybynite wrote:*   

>  *Master One wrote:*   
> 
> I think this is the major downside, to have a DISTDIR and a separate cache-dir.
> 
> Is there still no solution, for having DISTDIR=cache-dir? This would make it so much easier, also to integrate deltup the easy way. 
> ...

 

Ok, convinced  :Wink: 

 *flybynite wrote:*   

> I don't know what the problem is with deltup you mentioned.

 

It doesn't work. I played around for two days trying to combine http-replicator & deltup, but it's a no-go. I had it up and running so that when emerging something on the machine running http-replicator, it downloaded just the delta files and built the resulting archive. But this does not work when emerging from a client machine anything that is not already in the http-replicator cache, because http-replicator then downloads the requested file(s) itself with its own download routine. I think this problem can only be solved if http-replicator used portage's FETCHCOMMAND defined in make.conf (because that is the only way the getdelta.sh script can be involved) instead of its own download routine.

Any chance for an implementation?

The combination of http-replicator & deltup would be the ultimate download solution; it was amazing to see how much download volume deltup saved.

----------

## bigmacx

 *flybynite wrote:*   

> Yes, read the howto again....
> 
>  *Quote:*   
> 
> 2. Modify /etc/make.conf on both the server and your other gentoo boxes. 
> ...

 Yes, clearly it does say that. Thanks for pointing it out.

I think I've got my situation handled now. I'll explain my present configuration in order to help others implement a fully contained setup and entertain possible constructive comments on how to improve this arrangement.

BACKGROUND: I wanted to have a method to auto-update my Gentoo PCs. That is, auto emerge-sync and auto emerge-update. I also wanted to get emailed status messages about the results and to use a shared distfiles location. For me, that meant built-in Gentoo commands, custom scripts, fcron, postfix, http-replicator, and rsync. The first 4 items are beyond the scope of this thread, so I'll just show those entries and describe the http-replicator and rsync parts.

On the Server:

1.   Install fcron, postfix, http-replicator.

2.   Configure http-replicator server as per howto.

3.   Configure fcron normally.

4.   Configure postfix as appropriate (ssmtp will work).

5.   Add this kind of cron entry for root.

/etc/crontab

```
&bootrun(true),exesev(false),first(5),mail(true),mailto(xxxxx@comcast.net),serial(true) 0 3 * * *      /etc/scripts/emergesync

&bootrun(true),exesev(false),first(15),mail(true),mailto(xxxxx@comcast.net),serial(true) 0 5 * * *      /etc/scripts/emergeupdate
```

These lines are almost all about fcron. Basically, I wanted the emerge scripts to run every day, and to run at boot if the machine was down overnight. The important part here is that the server was configured to "emerge sync" at 3AM and "emerge update" at 5AM.

6.   Add the 2 scripts to /etc/scripts:

/etc/scripts/emergesync

```
#!/bin/sh

emerge sync --nospinner
```

/etc/scripts/emergeupdate

```
#!/bin/sh
# pretend-run first; the real update, then repcacheman's rescue of good
# distfiles into the cache, only proceed if each previous step succeeds

emerge -p --nospinner --update world &&

emerge --nospinner --update world &&

repcacheman &&

rm -vrf /usr/portage/distfiles/* 

rm -vrf /var/tmp/portage/* 

```

This script does the actual updating and cleans up portage aftermath.

7.   Configure rsyncd on the server:

/etc/rsync/rsyncd.conf

```
[cache]

path = /var/cache/http-replicator

comment = Gentoo Linux distfiles mirror

read only = false
```

This configures the rsyncd server to allow clients to upload to the rsyncd server. There is much more to configuring the rsyncd server and is documented in this thread:

https://forums.gentoo.org/viewtopic.php?t=59134&highlight=rsync
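One caution on the module above: with "read only = false", anything on the network that can reach the rsync port may upload into the cache. Stock rsyncd.conf options can narrow that down - the address range and user below are assumptions to adapt to your own LAN:

```

[cache]

path = /var/cache/http-replicator

comment = Gentoo Linux distfiles mirror

read only = false

# only let the LAN write, and drop privileges while doing it
hosts allow = 192.168.0.0/24

uid = portage

gid = portage

```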

8.   Start all server services and verify proper operation.

On each Client:

1.   Install fcron, postfix.

2.   Copy these files from the http-replicator server to the client (preserve directory structure):

```
/etc/conf.d/http-replicator

/usr/bin/repcacheman

/usr/bin/repcacheman.py
```

You could just as well install the full http-replicator package on the client but I just needed these 3 files.

3.   Configure fcron normally.

4.   Configure postfix as appropriate (ssmtp will work).

5.   Add this kind of cron entry for root.

/etc/crontab

```
&bootrun(true),exesev(false),first(5),mail(true),mailto(xxxxx@comcast.net),serial(true) 0 4 * * *      /etc/scripts/emergesync

&bootrun(true),exesev(false),first(15),mail(true),mailto(xxxxx@comcast.net),serial(true) 0 5 * * *      /etc/scripts/emergeupdate
```

Same file basically. Important part here is that the client was configured to "emerge sync" at 4AM and "emerge update" at 5AM.

6.   Add the 2 scripts to /etc/scripts:

/etc/scripts/emergesync

```
#!/bin/sh

emerge sync --nospinner
```

/etc/scripts/emergeupdate

```
#!/bin/sh
# update, let repcacheman move good distfiles into the local cache dir,
# then push them up to the server's rsyncd [cache] module and clean up

emerge -p --nospinner --update world &&

emerge --nospinner --update world &&

repcacheman &&

rm -vrf /usr/portage/distfiles/* 

rm -vrf /var/tmp/portage/* 

rsync /var/cache/http-replicator/* rsync://xxxxx/cache

rm -vrf /var/cache/http-replicator/*
```

This script does the actual updating and cleans up portage aftermath. On the clients, repcacheman moves valid source files into /var/cache/http-replicator and then rsync copies them up to the central source location.

7.   Start all server services and verify proper operation.

COMMENTS:

This is working for me now. I haven't uber engineered all the aspects, so there may well be more effective ways to do parts of the above. If you see any improvements that can be made, please post them!

The way I read the original HOWTO, there was an emphasis on repeated running of repcacheman. This is mentioned as important to do on the server, but not mentioned concerning the clients.

Now I understand the continued need to run repcacheman on the server is due to server emerges. After the server emerges, it is possible that not all files used during the emerge will be in the cache, but will be left in the distfiles directory. So running repcacheman after a server-based emerge moves these "finds" into the http-replicator cache.

Will these newly moved files serve any function in the future? Doubtful. The original server emerge, using the server as its own proxy, did not populate the cache directory with these files, but did populate the distfiles directory. How would any future user of the cache benefit from these new files? It would seem that the client would use whatever out-of-band method to get these files, just like the server did.

So I assume in this case, that the continued use of repcacheman is just to centralize all the sources. This is my GOAL here. And having assumed that, I wondered: if we need to run repcacheman on the server, what about the clients? Well, you need to handle that case also to achieve the goal.

Others in this thread have mentioned running multiple copies of http-replicator on several PC's. I wanted a simple client-server setup and just installed the parts of http-replicator that I needed to make the clients work.

I used rsync to upload back to the server because I wanted to stay away from shared nfs/samba source directories and I hope that rsynd will handle the multiple client update problem inherent with just sharing dstfiles over nfs/samba.

Thanks for your work flybynite and all the others in this thread. Any constructive feedback would be greatly appreciated!

----------

## assaf

I don't know if this has been reported yet, but there seems to be a problem when trying to emerge fetch restricted packages (e.g. sun-jdk). The distfile must be placed in /usr/portage/distfiles for the emerge to continue, and http-replicator is completely out of the loop because portage doesn't even try to download the file...

Of course it's easy to workaround but...

----------

## Gherald

Fetch restricted packages must be downloaded manually on each machine.  This is expected behavior.

----------

## assaf

 *Gherald wrote:*   

> Fetch restricted packages must be downloaded manually on each machine.  This is expected behavior.

 

Of course they need to be downloaded manually, but you must emerge them right after downloading and before running repcacheman, otherwise the file will be moved to /var/cache/... and won't be found by portage.

(Of course you could run http_proxy="" emerge ...)

----------

## flybynite

 *assaf wrote:*   

> Wouldn't it make sense for http-replicator to stop downloading a file if there are no clients requesting it

 

I know what you mean.  It is because http-replicator uses Python's asyncore module.  This allows background downloading without using threads.  Http-replicator doesn't know when or if any client connects or disconnects because asyncore doesn't pass this info along.  Adding it might be possible, but low-level network ops are hard to debug and make bulletproof.

A custom asyncore module is being looked at, this may be possible in the future.

----------

## flybynite

 *Master One wrote:*   

> 
> 
>  I think this problem only can be solved, if http-replicator would use portage's FETCHCOMMAND defined in make.conf (because this is the only way, how the getdelta.sh script could be involved) instead of its own download routine.
> 
> Any chance for an implementation?
> ...

 

I've tried to use wget to download packages but it doesn't work.  I can't get any general-purpose download program to give the feedback on errors and the low-level data control that http-replicator needs.

It sounds to me like getdelta needs to have some network functions built in...  

What might work for now is to have the getdelta.sh script fetch the files and then upload the files back to the http-replicator server for others?

----------

## flybynite

 *bigmacx wrote:*   

> 
> 
> The way I read the original HOWTO, there was an emphasis on repeated running of repcacheman. This is mentioned as important to do on the server, but not mentioned concerning the clients.
> 
> 

 

repcacheman isn't installed on the clients and isn't needed at all on the clients.  The only time it needs to run on the server is after an emerge on the server.

 *bigmacx wrote:*   

> 
> 
> Now I understand the continued need to run repcacheman on the server is due to server emerges. After the server emerges, it is possible that not all files used during the emerge will be in the cache, but will be left in the dstfiles directory. So running repcacheman after a server based emerge moves these "finds" into the http-replicator cache.
> 
> 

 

True

 *bigmacx wrote:*   

> 
> 
>  It would seem that the client would use whatever out-of-band method to get these files just like the server did.
> 
> 

 

Not true.  

There are several ways files can get to distfiles and not go through http-replicator.

1.  FTP is the most obvious.  http-replicator currently doesn't work at all with FTP mirrors.  I've found some new packages are only available on the author's FTP site at first, then eventually get mirrored on an HTTP mirror, probably within hours of release.  Clients will check your HTTP mirrors first, so http-replicator can serve these files after repcacheman moves them to the cache.

2.  Portage has a "Local Mirror" override option for some ebuilds with only FTP sources or RESTRICT=nomirror.  

From the HOWTO:

 *Quote:*   

> Also, some packages in portage have a RESTRICT="nomirror" option which will prevent portage from checking replicator for those packages. The following will override this behavior. Create the file "/etc/portage/mirrors" containing:
> 
> ```
> 
> # Http-Replicator Override for FTP and RESTRICT="nomirror packages
> ...

 

What I can't override is the RESTRICT=nofetch option.  Things like Sun's JDK.  You have to go and click through some license or something, so portage will never download this package.  There is no override in portage for this, and it frustrates all admins of multiple boxen.

However if you run repcacheman, this package will be moved into the replicator's cache and can be downloaded from there.  The whole replicator cache can be browsed and downloaded from http://replicatorbox:8080/, or elsewhere if you set the correct alias in the options.

In the SPECIAL setup you're trying to run, repcacheman is necessary because portage leaves borked, incomplete and corrupt files in distfiles all the time.  In your situation, repcacheman will checksum the files, preventing junk from getting into the cache.

My only comment on your setup is that it isn't really a good idea to do mass unattended portage updates. 

Unfortunately, some updates will break your system or other packages.  The developers are getting much better, but it can and has happened to me.

What I do is update one box first, setting portage to automatically build binary packages.  Then after a while, I update the other boxes using the binaries I've created.  This saves time and prevents a lot of heat in my laptops!  Since not all my boxes are the same arch, some still just download the distfiles from http-replicator and do their own compiling.

I might update firefox to other boxes right after I launch the new version.  In the case of GCC, glibc, or other system package I will wait days before upgrading all other boxes.
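The build-once, install-many flow described above maps onto stock portage knobs. A sketch - the binhost URL is a placeholder that depends on how the alias option is set up:

```

# on the build box, in /etc/make.conf: keep a binary of everything compiled
FEATURES="buildpkg"

# on the other boxes of the same arch, in /etc/make.conf:
PORTAGE_BINHOST="http://replicatorbox:8080/packages"

# then update them with:
# emerge --usepkg --update world

```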

----------

## bigmacx

 *flybynite wrote:*   

> 
> 
> repcacheman isn't installed on the clients and isn't needed at all on the clients.  The only time it needs to run on the server is after an emerge on the server.
> 
> ...
> ...

 

I know my post count is down on this forum, but I find it slightly amusing that everyone responding to my post dismisses the validity of the question and gives a trivial answer.

So then lets see:

1.   http-replicator Clients do emerges just like http-replicator Servers ---------->CHECK

2.   http-replicator Clients can install packages which are not simultaneously or previously installed on http-replicator Server ---------->CHECK

3.   If http-replicator Client emerges a package that uses non-http-replicator-Server methods to download the package source, the package source will be in the http-replicator Client's local distfiles directory ---------->CHECK

4.   So then, this package source will never get to http-replicator Server's cache ---------->CHECK

5.   Putting repcacheman and rsync on http-replicator Client solves this condition ---------->CHECK

 *flybynite wrote:*   

> My only comment on your setup is that it isn't really a good idea to do mass unattended portage updates. 
> 
> Unfortunately, some updates will break your system or other packages.  The developers are getting much better, but it can and has happened to me.
> 
> 

 

I'm aware of this problem. There is also the issue of the "._cfgxxxxx" files and etc-update. Right now, I'm just trying to prototype this whole auto-emerge portion.

I have plans to try to create a patching system similar to what we use at work with the Windows servers. We use VMware and snapshotting to duplicate our production environment into a test environment, then patch and test the servers, and move the patched OSes back to production. This does not always work as cleanly as we would like, due to the handholding needed for some databases and custom software. But the big benefit is that we stopped "Patching and Praying" with our production servers a LONG time ago, when we got burnt by WindowsUpdate.

Since we are trying to use more Linux at work, I wanted to see about getting a similar automatic update going for the production->test->production maneuver. I'll use either VMware or Xen, but I would like to try the snapshotting offered by LVM.

I hope the above helps explain what I'm asking, because if someone tells me one more time that "repcacheman does not need to be run on the clients," I'm gonna cyberscream!!!! :Razz: 

----------

## killercow

Okay, 

Here's my post again. I know I could read the entire thread again and see what has been achieved by now, but could the following questions/functions be addressed by a next version of this? (Or should this be a separate program?)

Im wondering if this program could be used to do the following:

Create a single point for portage to store all of the synced files, probably mounted through NFS/Samba or queried by a special portage clone.

Create a single server/machine that would handle compiling programs (possibly helped by other machines through distcc and ccache).

Create a way to handle different arch and make options on one server.

This would be the sweetest option available for things like clusters/schools/internet cafe's.

Since most of these situations:

can't have their nodes compile things for themselves,

have their nodes configured in a couple of different ways,

usually have a couple of nodes per config/arch set,

and only need one portage cache.

This would imply that any machine on the LAN could just do an 'emerge' or 'emerge world -up', and it would query the shared portage tree for its packages depending on its own config set.

It would then ask the http_proxy thingy for the binary packages for its arch and make config.

The proxy would then either hand out that package so it can be installed, or start compiling it (maybe with the help of other machines, including the requestor, through distcc).

Since some packages will be built with different USE/make flags but end up no different (e.g. when the make flags don't apply), this would be handled simplest by using a large (2GB+?) ccache.

Any other thoughts on this?

I would use it for my cluster consisting of 5 servers all configured the same; one other server, which only has two different make flags;

and one completely different server (different arch and setup, because it's the fileserver).

This would help my cluster get managed the way a setup like that should be (just call the 'emerge world -up' command on every machine, and I'm set).

And if I need a special package on one of the nodes, it could just emerge it, and it would get recorded in that node's world file.

----------

## assaf

LOL! An internet cafe running gentoos...

----------

## flybynite

 *killercow wrote:*   

> 
> 
> Im wondering if this program could be used to do the following:
> 
> 

 

No.  In fact, I would say that no one program is ever going to do all these things.  Everything you ask can be done today but will require you to do some coding/scripting.......

My best guesses: 

If the nodes can't compile anything then they probably should be thin clients and just netboot an image off the server.  Gentoo can be used for the server and for building the different images.  Probably best for internet cafés...

Most nodes that can do _some_  limited compiling can use distcc.  Distcc can be configured so no compiling is done on the localhost (some has to be done on localhost, but not much).  I do the custom packages on my laptops this way.  The common packages are binaries served through http-replicator.
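A sketch of that "almost nothing on localhost" distcc setup - the hostnames and job limits below are placeholders, and the host/limit syntax is documented in the distcc man page:

```

# /etc/distcc/hosts - jobs go to the listed boxes in order of preference;
# localhost still does preprocessing and the occasional local job
bigbox/8 otherbox/4 localhost/1

```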

Did you know that different archs can still use distcc:

 using amd 64 for 32 bit distcc compiles

The same chroot solution can compile on remote hosts without distcc: 

Compiling by Proxy: Rsync Edition

So are you willing to try and do this or are you just looking to "emerge" a total solution?

----------

## killercow

 *flybynite wrote:*   

>  *killercow wrote:*   
> 
> Im wondering if this program could be used to do the following:
> 
>  
> ...

 

I'm willing to help develop it.

My questions are not supposed to yield an emerge-able solution, as I know there is no such thing. They are meant to stir up discussion about programs like yours and what their future could be.

I think I described a few scenarios in which it would be good to have such a solution.

Internet cafés, schools, offices and cluster nodes could all make use of one program to handle the following things:

Emerge new programs, and Emerge updates (like any normal gentoo setup)

But with the following distinct advantages;

Nodes get their packages from a master "emerge" server, which hands out binaries based on the requesting arch/make/USE flags.

Nodes could participate in the actual compiling if the master emerge server asks them to through distcc (speeding up the building of packages).

Nodes could install their own packages, which comes in handy in offices (since not every system should be the same, the setup remains flexible, unlike net-boots).

Each node can run different hardware with optimized software (unlike net-boots).

One server handles all outgoing HTTP traffic (no more overhead from different systems trying to sync/download packages; this is where http-replicator comes in).

Nodes could still issue a simple "net-emerge world -up" to update.

Nodes could still issue a simple "net-emerge package*** " to install a package.

The question is: is this a view I share with others, or am I on my own? And do you think Gentoo is not meant to be run on larger mixed LANs?

As I reckon, it makes administrating a lot of homo-/heterogeneous LANs a lot simpler and faster, since they all run Gentoo.

Here's a mockup:

http://www.innerheight.com/portage.pdf

Last edited by killercow on Wed May 25, 2005 10:11 am; edited 1 time in total

----------

## killercow

 *assaf wrote:*   

> LOL! An internet cafe running gentoos...

 

And why is this not doable?

It would give you a secure platform, with exactly the apps you'd like.

Many internet cafe systems in use today run expensive kiosk software wrapped around IE, on top of an OS which is not really needed for anything but still needs dozens of updates and constant care to keep working.

----------

## assaf

 *killercow wrote:*   

>  *assaf wrote:*   LOL! An internet cafe running gentoos... 
> 
> And why is this not doable?
> 
> It would give you a secure platform, with exactly the apps you'd like.
> ...

 

Exactly. It's funny to think about an internet cafe sysadmin who has even heard of Gentoo, let alone is skilled enough to maintain it. It's more likely that the internet cafe owner will hire the cheapest sysadmin, and that sysadmin will install Red Hat or Windows...

----------

## killercow

 *assaf wrote:*   

>  *killercow wrote:*    *assaf wrote:*   LOL! An internet cafe running gentoos... 
> 
> And why is this not doable?
> 
> It would give you a secure platform, with exactly the apps you'd like.
> ...

 

10 years ago, people said the same thing about internet cafés in general.

Cafés with internet access, LOL! (Why would anyone sit behind a computer in a cafe?)

I think Gentoo could run fine in internet cafés if it had a system like I described above, and thus I think we should make an effort to create that system.

----------

## NightMonkey

 *assaf wrote:*   

>  *killercow wrote:*    *assaf wrote:*   LOL! An internet cafe running gentoos... 
> 
> And why is this not doable?
> 
> It would give you a secure platform, with exactly the apps you'd like.
> ...

 

I think you are slagging a big group of people with too broad a brush. And not everyone can start their SysAdmin career working for IBM, Sun or NASA. I was in Budapest in 2001, and what I saw "Internet Cafe" admins doing there was quite advanced, really squeezing the MHz out of older hardware, too. In Amsterdam, there were even spiffier options at some of the "mega cafes" there. And in my home town, many "Internet Cafe" admins are helping to make my whole city a free wireless hotspot.

Anyhow, Open Source, and high tech in general, grows not by people slagging the projects of others, but by incorporating great ideas from all projects, no matter how "mighty" or "small". Please think twice before belittling others' work or career choices, especially on a public forum.

And no, I don't admin a cafe network...  :Wink: 

----------

## assaf

 *NightMonkey wrote:*   

>  *assaf wrote:*    *killercow wrote:*    *assaf wrote:*   LOL! An internet cafe running gentoos... 
> 
> And why is this not doable?
> 
> It would give you a secure platform, with exactly the apps you'd like.
> ...

 

Hey, j/k  :Wink: 

----------

## killercow

So, 

Back on topic (not really The topic, but my topic)

What would be needed to create a system like mine? And who would be interested in it?

I'd personally love it for my clusters, and maybe even for my collection of home PCs.

Of course I could share processing power and build time with my friends, who might not have the same horsepower at home as I do. (Since my machine is on 24/7, I could allow a friend to compile on my machine through this system at night.)

----------

## flybynite

Many thanks to Maurice van der Pot, who on 02 Jun 2005 added http-replicator as an official Gentoo package!!

See http://packages.gentoo.org/ebuilds/?http-replicator-3.0

Current users wishing to switch to the official ebuild should check the howto at the start of this thread.  It has been updated to reflect the official ebuild.

The changes in the official version are minimal and will not affect your current http-replicator setup.

----------

## NightMonkey

 *flybynite wrote:*   

> Many thanks to Maurice van der Pot, who on 02 Jun 2005 added http-replicator as an official Gentoo package!!
> 
> See http://packages.gentoo.org/ebuilds/?http-replicator-3.0
> 
> Current users wishing to switch to the official ebuild should check the howto at the start of this thread.  It has been updated to reflect the official ebuild.
> ...

 

Congrats, flybynite. I've been using your code for over a year now, and it just rocks, and keeps the good Gentoo mirror admins from blocking my NAT'd IP with several boxes behind it.  :Smile:  Thank you!

----------

## javac16

I want to pass on my congrats as well.  I have been using http-replicator for a long time and really like it.  I have just upgraded to the portage version.

----------

## javac16

I have recently begun having some issues on my http-replicator server box.  My client machines can connect to it no problem, but the box itself doesn't seem to be able to download anything; it just hangs.  I have changed mirrors a number of times with no luck, and have restarted http-replicator.

Running a netstat -at shows that the connection to the mirror site is successful but it just doesn't seem to download.  I can successfully download when I do not go through the http-replicator proxy.  Any ideas where I should look?

```

22 Jun 2005 18:49:13 STAT: HttpServer 41 bound to gentoo.chem.wisc.edu

22 Jun 2005 19:04:13 STAT: HttpClient 42 bound to 192.168.2.166

22 Jun 2005 19:04:13 INFO: HttpClient 42 proxy request for http://gentoo.chem.wisc.edu/gentoo/distfiles/netkit-ftp-0.17.tar.gz

22 Jun 2005 19:04:14 STAT: HttpServer 42 bound to gentoo.chem.wisc.edu

22 Jun 2005 19:19:15 STAT: HttpClient 43 bound to 192.168.2.166

22 Jun 2005 19:19:15 INFO: HttpClient 43 proxy request for http://gentoo.chem.wisc.edu/gentoo/distfiles/netkit-ftp-0.17.tar.gz

22 Jun 2005 19:19:15 STAT: HttpServer 43 bound to gentoo.chem.wisc.edu

22 Jun 2005 19:34:18 STAT: HttpClient 44 bound to 192.168.2.166

22 Jun 2005 19:34:18 INFO: HttpClient 44 proxy request for http://gentoo.chem.wisc.edu/gentoo/distfiles/netkit-ftp-0.17.tar.gz

22 Jun 2005 19:34:18 STAT: HttpServer 44 bound to gentoo.chem.wisc.edu

22 Jun 2005 19:49:22 STAT: HttpClient 45 bound to 192.168.2.166

22 Jun 2005 19:49:22 INFO: HttpClient 45 proxy request for http://gentoo.chem.wisc.edu/gentoo/distfiles/netkit-ftp-0.17.tar.gz

22 Jun 2005 19:49:22 STAT: HttpServer 45 bound to gentoo.chem.wisc.edu

22 Jun 2005 21:00:20 STAT: HttpClient 46 bound to 192.168.2.166

22 Jun 2005 21:00:20 INFO: HttpClient 46 proxy request for http://gentoo.chem.wisc.edu/gentoo/distfiles/netkit-ftp-0.17.tar.gz

```

----------

## Quincy

Is it a bug or a feature.....

I have set up the portage version of http-replicator and it works fine. It saves me a lot of time and bandwidth over copying files around by hand.

But I think there is a problem with resuming downloads. Is it impossible for http-replicator to see a partly downloaded file and resume it? I'm asking because I was downloading the new kde version via ISDN and my connection got interrupted after 10MB of a package. I reconnected, and wget correctly showed: Partial content...resuming, but then it got stuck in this position without doing anything obvious. My internet connection is fully loaded by http-replicator downloading some stuff.

Has it started the file over again? Is there a solution for this "problem"?

----------

## flybynite

 *Quincy wrote:*   

> 
> 
> Is it impossible for http-replicator to see a partly downloaded file and start resuming it? I'm asking because i was downloading the new kde version via ISDN and my connection got interrupted after 10MB of a package.

 

http-replicator only supports resuming on the client side.  If the complete file is in the cache, clients can resume downloads.

If http-replicator doesn't have the complete file in the cache, how would it know?  How does apache know if the web pages it serves are broken?  The answer is they don't.  They can only serve what the admin has provided.  If incomplete files get into replicator's cache, replicator will still serve them.

If http-replicator knows a file is incomplete, it will delete it from the cache.  This happens if you start a long download and then restart or shut down http-replicator.

If you really want to resume your download, temporarily remove the http_proxy from your /etc/make.conf and let portage resume it from one of the mirrors.  Not all mirrors support resuming, so you may still lose what you have.  Then follow the steps below to put the files back into replicator.

I wrote the repcacheman script to be able to verify that downloads are complete.  In the current version you can move all the files from replicator's cache to the distfiles dir and then run repcacheman.  It will checksum the files and move the good ones back to the cache dir.

This should do it, and delete the junk left over, if you use the default dirs.

```

mv /var/cache/http-replicator/* /usr/portage/distfiles

repcacheman

rm /usr/portage/distfiles/*

```

In the future:

1.  http-replicator may support resuming its downloads.

2.  repcacheman may automatically checksum the cache whenever it is run

3.  portage may be smart enough to work with replicator?????

----------

## flybynite

 *javac16 wrote:*   

> I have recently begun having some issues on my http-replicator server box.  My client machines can connect to it no problem, but the box itself doesn't seem to be able to download anything; it just hangs. 

 

Have you been able to fix this yet?

----------

## Quincy

 *flybynite wrote:*   

> 
> 
> In the future:
> 
> 1.  http-replicator may support resuming its downloads.
> ...

 

I hope it will be the last one....downloading for every single machine is annoying and needless.

So I'm glad about http-replicator  :Very Happy: 

----------

## javac16

 *flybynite wrote:*   

>  *javac16 wrote:*   I have recently begun having some issues on my http-replicator server box.  My client machines can connect to it no problem, but the box itself doesn't seem to be able to download anything it just hangs.  
> 
> Have you been able to fix this yet?

 

No, and I haven't found any more clues to the problem either.  I have currently stopped investigating due to a lack of time (and just download from the mirrors directly again - I only have 2 boxes)...any ideas?

----------

## hothead

Hi,

I just set up http-replicator and it is working here, apart from the fact that it refuses to fetch the kernel sources.

Can someone help here?

```
root@workstation [/home] ACCEPT_KEYWORDS=~x86 emerge -f =vanilla-sources-2.6.12

Calculating dependencies ...done!

>>> emerge (1 of 1) sys-kernel/vanilla-sources-2.6.12 to /

>>> Resuming download...

>>> Downloading http://gentoo.blueyonder.co.uk/distfiles/linux-2.6.12.tar.bz2

--20:55:48--  http://gentoo.blueyonder.co.uk/distfiles/linux-2.6.12.tar.bz2

           => `/usr/portage/distfiles/linux-2.6.12.tar.bz2'

Auflösen des Hostnamen »localhost«.... 127.0.0.1

Verbindungsaufbau zu localhost[127.0.0.1]:8080... verbunden.

Proxy Anforderung gesendet, warte auf Antwort... 200 OK

Länge: nicht spezifiziert

    [ <=>                                                                                           ] 0             --.--K/s

20:55:48 (0.00 B/s) - `/usr/portage/distfiles/linux-2.6.12.tar.bz2' saved [0]

>>> Resuming download...

>>> Downloading http://www.us.kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.tar.bz2

--20:55:48--  http://www.us.kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.tar.bz2

           => `/usr/portage/distfiles/linux-2.6.12.tar.bz2'

Auflösen des Hostnamen »localhost«.... 127.0.0.1

Verbindungsaufbau zu localhost[127.0.0.1]:8080... verbunden.

Proxy Anforderung gesendet, warte auf Antwort... 200 OK

Länge: nicht spezifiziert

    [ <=>                                                                                           ] 0             --.--K/s

20:55:48 (0.00 B/s) - `/usr/portage/distfiles/linux-2.6.12.tar.bz2' saved [0]

!!! Couldn't download linux-2.6.12.tar.bz2. Aborting.

!!! Fetch for /usr/portage/sys-kernel/vanilla-sources/vanilla-sources-2.6.12.ebuild failed, continuing...

!!! Some fetch errors were encountered.  Please see above for details.

root@workstation [/home]       
```

----------

## flybynite

Can you translate this: Länge: nicht spezifiziert 

I'm thinking it's Requested Range Not Satisfiable or similar.

If so, follow the instructions in this post:

https://forums.gentoo.org/viewtopic-t-173226-postdays-0-postorder-asc-start-347.html

----------

## hothead

'Länge nicht spezifiziert' means 'Size not specified'.

I solved my problem - it was really easy.

There were some files with zero size in the cache.

(I once copied the http-replicator directory to a separate partition to resize the old one. But the backup partition was a bit too small - that's how the zero-size files came about.)

I had to manually delete all the zero-size files; after that it worked.
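That manual cleanup can be done in one pass with find. Here is a sketch against a throwaway directory standing in for /var/cache/http-replicator, so the effect is visible without touching a real cache:

```shell
# Throwaway stand-in for the replicator cache directory
demo=$(mktemp -d)
touch "$demo/empty.tar.bz2"           # a zero-byte leftover
printf 'data' > "$demo/good.tar.bz2"  # a healthy download

# Delete every zero-byte regular file, leaving real files alone
find "$demo" -type f -size 0 -delete
```

Pointing the same find at the real cache dir (with http-replicator stopped) should clear the zero-size leftovers in one go.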

Is it possible to make http-replicator aware of this problem?

If it detects a zero-size file, it could prompt to delete it in order to redownload the file.

Regards

hothead

----------

## flybynite

 *hothead wrote:*   

> 
> 
> Is it possible to make http-replicator aware of this problem?
> 
> If it detects a zero-size file, it could prompt to delete it in order to redownload the file.
> ...

 

replicator needs to be distro-neutral, and I don't think having too much logic inside the server is a good idea.  What is in the works is to have repcacheman manage the cache and delete any files that fail the md5 check.

Until then, you can move the cache to the distfiles dir, and repcacheman will md5-check the files and move the good ones back to the cache.

----------

## assaf

Since I've upgraded to gentoo-sources-2.6.12-r4 I've been having a problem with http-replicator not starting correctly.

During boot up the http-replicator line has an 'OK' beside it, but the port remains closed until I manually restart the service.

Also, when I restart it, the stop part has an 'OK' beside it but prints something like: 'no running http-replicator found, none killed'.

Afterwards it works fine. The same thing happened on both my machines, and it looks like the same thing is happening with the hddtemp service. Anyone else?

----------

## javac16

 *javac16 wrote:*   

>  *flybynite wrote:*    *javac16 wrote:*   I have recently begun having some issues on my http-replicator server box.  My client machines can connect to it no problem, but the box itself doesn't seem to be able to download anything it just hangs.  
> 
> Have you been able to fix this yet? 
> 
> No and haven't found any more clues to the problem either.  I have currently stopped investigating due to a lack of time (and just download from the mirrors directly again - I only have 2 boxes)...any ideas?

 

I just rebuilt my http-replicator box (I wanted to change the partitions and uninstall a bunch of stuff -- figured I might as well just start anew) and I am no longer experiencing the issue.  I am guessing that a different issue on my box was causing the problem, though I don't know what.

----------

## flybynite

 *assaf wrote:*   

> Since i've upgraded to gentoo-sources-2.6.12-r4 i've been having a problem with http-replicator not starting correctly.

 

kernel 2.6.12-r4 has a known networking bug.  Try upgrading....

----------

## assaf

 *flybynite wrote:*   

>  *assaf wrote:*   Since i've upgraded to gentoo-sources-2.6.12-r4 i've been having a problem with http-replicator not starting correctly. 
> 
> kernel 2.6.12-r4 has a known networking bug.  Try upgrading....

 

Ok, thanks. I'll try 2.6.12-r6.

----------

## assaf

 *assaf wrote:*   

>  *flybynite wrote:*    *assaf wrote:*   Since i've upgraded to gentoo-sources-2.6.12-r4 i've been having a problem with http-replicator not starting correctly. 
> 
> kernel 2.6.12-r4 has a known networking bug.  Try upgrading.... 
> 
> Ok, thanks. I'll try 2.6.12-r6.

 

That didn't help. Same problem with r6.

----------

## flybynite

 *assaf wrote:*   

> 
> 
> That didn't help. Same problem with r6.

 

2.6.12-r6 works for me.  Maybe you changed something else besides the kernel?

```

gate1 ~ # uname -a

Linux gate1 2.6.12-gentoo-r6 #1 SMP Wed Jul 20 14:18:23 CDT 2005 i686 Pentium II (Deschutes) GenuineIntel GNU/Linux

gate1 ~ # /etc/init.d/http-replicator restart

 * Stopping Http-Replicator ...                                                                      [ ok ]

 * Starting Http-Replicator ...                                                                       [ ok ]

gate1 ~ # ps aux | grep http-replicator

portage   8511  0.0  1.1   6816  4368 pts/5    S    22:58   0:00 /usr/bin/python /usr/bin/http-replicator -s -f --pid /var/run/http-replicator.pid --daemon --dir /var/cache/http-replicator --user portage --alias /usr/portage/packages/All:All --alias /usr/src/:src --log /var/log/http-replicator.log --debug --ip 192.168.*.* --ip 10.*.*.* --port 8080

root      8517  0.0  0.1   1616   476 pts/5    S+   22:58   0:00 grep http-replicator

gate1 ~ # emerge -f ufed

Calculating dependencies ...done!

>>> emerge (1 of 1) app-portage/ufed-0.36 to /

>>> Downloading http://gentoo.osuosl.org/distfiles/ufed-0.36.tar.bz2

--23:03:08--  http://gentoo.osuosl.org/distfiles/ufed-0.36.tar.bz2

           => `/usr/portage/distfiles/ufed-0.36.tar.bz2'

Resolving gate1.homenet.com... 127.0.0.1

Connecting to gate1.homenet.com[127.0.0.1]:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: 12,997 [application/x-tar]

100%[================================================================================>] 12,997        29.72K/s

23:03:09 (29.63 KB/s) - `/usr/portage/distfiles/ufed-0.36.tar.bz2' saved [12,997/12,997]

>>> ufed-0.36.tar.bz2 size ;-)

>>> ufed-0.36.tar.bz2 MD5 ;-)

>>> md5 files   ;-) ufed-0.39.ebuild

>>> md5 files   ;-) ufed-0.36.ebuild

>>> md5 files   ;-) ufed-0.40_pre2.ebuild

>>> md5 files   ;-) files/digest-ufed-0.40_pre2

>>> md5 files   ;-) files/digest-ufed-0.36

>>> md5 files   ;-) files/digest-ufed-0.39

>>> md5 src_uri ;-) ufed-0.36.tar.bz2

gate1 ~ #     

```

----------

## carpman

Hello, I got the following error when starting http-replicator, even though the ebuilds had been moved after running repcacheman:

```

/etc/init.d/http-replicator start

 * Caching service dependencies ...                                                  [ ok ] * Starting Http-Replicator ...

usage: http-replicator [options]

http-replicator: error: no read/write permission for directory '/httprep'  

```

I'm also getting the following when running emerge on the server box:

```

Connecting to caxton.michael.co.uk[192.168.1.2]:8080... failed: Connection refused.

>>> Downloading http://cudlug.cudenver.edu/gentoo/distfiles/sandbox-1.2.11.tar.bz2

--19:43:35--  http://cudlug.cudenver.edu/gentoo/distfiles/sandbox-1.2.11.tar.bz2

           => `/usr/portage/distfiles/sandbox-1.2.11.tar.bz2'

Resolving caxton.michael.co.uk... 192.168.1.2

Connecting to caxton.michael.co.uk[192.168.1.2]:8080... failed: Connection refused.

```

----------

## houtworm

http-replicator will not start anymore :-/

Perhaps it has something to do with a system crash (had to use the power switch) or with an update.

A http-replicator.pid file is created in /var/run.

This is what I see at the command line (with the --debug option active in the .conf):

```

# /etc/init.d/http-replicator start

 * Starting Http-Replicator ...

 * Failed to start Http-Replicator                                             [ !! ]

```

The http-replicator logfile:

```

02 Aug 2005 22:44:32 INFO: HttpReplicator started

02 Aug 2005 22:44:32 INFO: HttpReplicator terminated

```

daemon.log: 

```

Aug  2 22:44:32 cbw rc-scripts: Failed to start Http-Replicator

```

help! 

What can I try to find the error?

----------

## houtworm

 *houtworm wrote:*   

> 
> 
> # /etc/init.d/http-replicator start
> 
>  * Starting Http-Replicator ...
> ...

 

Problem solved!!

it was baselayout-1.12.0_pre3-r2   :Sad: ((

so I went back to the old baselayout, and http-replicator works again  :Smile: )

----------

## dalek

I have a question.  When I run repcacheman, it seems to delete some of the files in my /usr/portage/distfiles directory that I still need.  This is some of what I get when I run it but there is a whole lot more of them.

```
Verifying checksum's....

WARNING qcad-2.0.3.1-1.src.tar.gz is not in portage!!!

WARNING kdegames-3.2.3.tar.bz2 is not in portage!!!

WARNING xorg-x11-6.8.2-files-0.1.tar.bz2 is not in portage!!!

WARNING NVIDIA-Linux-x86-1.0-7664-pkg0.run is not in portage!!!

WARNING mozilla-launcher-1.34.bz2 is not in portage!!!

****** SNIP *******

SUMMARY:

Found 1486 duplicate file(s).

        Deleted 1486 dupe(s).         <--------------------SAY WHAT!!!!!!  

Found 493 new file(s).

        Added 0 of those file(s) to the cache.

        Rejected 4 corrupt or incomplete file(s).

        489 Unknown file(s) that are not listed in portage

        You may want to delete them yourself....

Done!

```

If I tell it to emerge -ef world after running that, it will have to download over 1GB of stuff that I just had.  I did make a backup before running this.  I'm on dial-up and that would take a week of 24/7 downloading to get.  I would likely just order the CDs and copy them over.

This is my /etc/conf.d/http-replicator file:

```
## Set the cache dir

GENERAL_OPTS="--dir /var/cache/http-replicator"

## Change UID/GID to user after opening the log and pid file.

## 'user' must have read/write access to cache dir:

GENERAL_OPTS="$GENERAL_OPTS --user portage"

## Don't change or comment this out:

DAEMON_OPTS="$GENERAL_OPTS"

## Do you need a proxy to reach the internet?

## This will forward requests to an external proxy server:

## Use one of the following, not both:

#DAEMON_OPTS="$DAEMON_OPTS --external somehost:1234"

#DAEMON_OPTS="$DAEMON_OPTS --external username:password@host:port"

## Local dir to serve clients.  Great for serving binary packages

## See PKDIR and PORTAGE_BINHOST settings in 'man make.conf'

## --alias /path/to/serve:location will make /path/to/serve

## browsable at http://http-replicator.com:port/location

DAEMON_OPTS="$DAEMON_OPTS --alias /usr/portage/packages/All:All"

## Dir to hold the log file:

DAEMON_OPTS="$DAEMON_OPTS --log /var/log/http-replicator.log"

## Make the log messages less and less verbose.

## Up to four times to make it extremely quiet.

#DAEMON_OPTS="$DAEMON_OPTS --quiet"

#DAEMON_OPTS="$DAEMON_OPTS --quiet"

## Make the log messages extra verbose for debugging.

#DAEMON_OPTS="$DAEMON_OPTS --debug"

## The ip addresses from which access is allowed. Can be used as many times

## as necessary. Access from localhost is allowed by default.

DAEMON_OPTS="$DAEMON_OPTS --ip 192.*.*.*"

DAEMON_OPTS="$DAEMON_OPTS --ip 10.*.*.*"

## The proxy port on which the server listens for http requests:

DAEMON_OPTS="$DAEMON_OPTS --port 8080"
```

Did I do something wrong or is there a way to tell it to stop deleting my files?  I sure am glad I made those backups.    :Rolling Eyes: 

Thanks

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## Quincy

Are there files in "/var/cache/http-replicator"?

Normally the program deletes every file from /usr/portage/distfiles which is already in the cache directory. Emerge wants to "download" them if they aren't in the distfiles folder, but they will only be downloaded locally...

----------

## dalek

Well, I did find them in /var/cache/http-replicator later on but it wanted to download them from the net, NOT locally.  Should I tell my main box that acts as the server to look at itself for the files first instead of going to the net?

This is weird.  I'm getting to like the idea of just putting /usr/portage on NFS and sharing it that way.  That way it shares the snapshot for syncing and the distfiles are right there as well.  It also only has to download once regardless of which machine needs it.  It also saves on disk space because it is only stored once for ALL machines.

 :Confused:   :Confused:   :Confused: 

----------

## Quincy

It is stored once for all machines this way, too, because there is only one replicator cache.

And yes...you can tell the main box to load from itself via the proxy setting. In reality it just copies the files on the hdd, but it "thinks" they're coming from the net.

The advantage to NFS is that the systems are independent of each other and work without the main box as well.

----------

## dalek

On separate lines like this:

```
http_proxy="http://192.167.0.1:8080"

GENTOO_MIRRORS="ftp://ftp.gtlib.cc.gatech.edu/pub/gentoo http://csociety-ftp.ecn.purdue.edu/pub/gentoo/ ftp://csociety-ftp.ecn.purdue.edu/pub/gentoo/ "

```

or one line like this:

```

GENTOO_MIRRORS="http_proxy="http://192.167.0.1:8080" ftp://ftp.gtlib.cc.gatech.edu/pub/gentoo http://csociety-ftp.ecn.purdue.edu/pub/gentoo/ "
```

Just checking.  I shortened the mirror list to make the post shorter.  Oh, I ssh into all the other rigs anyway, so if my main rig is not running I can't ssh in at all.  They are set up as servers with no mouse, keyboard or monitor, unless one has a problem booting or something; then I hook one up to see what's up with it.

Thanks for the help.  Me was   :Confused:   :Confused:  .  Don't worry though, it is normal for me.  I sort of have my lady on my mind anyway.    :Rolling Eyes:    That ain't helping much either.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## Quincy

Like every other client....this solution   :Wink: 

 *dalek wrote:*   

> On separate lines like this:
> 
> ```
> http_proxy="http://192.167.0.1:8080"
> 
> ...

 
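Spelled out, the working form Quincy confirms is dalek's first variant: http_proxy as its own make.conf variable, never spliced into GENTOO_MIRRORS. Using dalek's own values as the example:

```shell
# /etc/make.conf on each client (the server can point at itself, too)
http_proxy="http://192.167.0.1:8080"
GENTOO_MIRRORS="ftp://ftp.gtlib.cc.gatech.edu/pub/gentoo http://csociety-ftp.ecn.purdue.edu/pub/gentoo/"
```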

----------

## dalek

 *Quincy wrote:*   

> Like every other client....this solution  
> 
>  *dalek wrote:*   On seperate lines like this:
> 
> ```
> ...

 

I went for broke and just tried it.  It worked.  I had backups though, two separate ones I might add.    :Shocked: 

I still want to know why it deletes them though.  It even deletes the most recent stuff that I am currently using.  It sounds strange to remove something that is needed and has nothing wrong with it.  If it failed the md5 check, that would be OK and understandable.  It is funny that it removes them, then when I do an emerge -f world, it just copies them right back again.  If I do the command again, it deletes them again.  < scratches head >

This a bug or like windoze a feature?    :Rolling Eyes: 

Later

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## Quincy

The "feature" is: when running repcacheman on the box which has the cache, every file already present in /var/cache/http-replicator is deleted from /usr/portage/distfiles, because it is a duplicate of something in the cache. This is a feature because the file can be re-fetched at any time from the machine itself.

Some of your packages fail the md5 check because e.g. kdegames-3.2.3 left portage some time ago.

Because of that:

Run repcacheman, check the remaining files in /usr/portage/distfiles and delete the old stuff by hand. You can't be missing anything that's in portage after running repcacheman...I've never had problems with that, perhaps because of my conventional, stable settings.

----------

## dalek

I see some of what you are saying, but if I do an emerge -efv world, it downloads a LOT of the files it just deleted.  If you want, I can do it again and list all the ones it should not delete for you to see.  Some may be old, but would emerge -efv world download an old package like kde-3-2-*?

I do need to clean out some old clutter.  There was a program to do that, but I can't find it anymore.  My search skills suck.    :Embarassed: 

Later

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## Quincy

No, emerge -efv world should download all the things which are in world....perhaps you have all this old stuff still installed?

Perhaps try emerge -Pp world (-P = prune --> looks for unneeded packages, but does not uninstall them; don't remove them all unseen...there are always important packages listed)

----------

## dalek

I made a new thread, maybe they won't move it to /dev/null.  Let's clutter up that thread instead.

https://forums.gentoo.org/viewtopic-p-2833974.html#2833974

Thanks

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## carmen

To have a 'decentralized' build environment - any client that builds something instantly makes the binary and source pkg available to everyone else - you need to do this (I think):

- run http-replicator on each client, and set its proxy to itself (and/or run repcacheman every time) to ensure any downloaded packages end up in the cache

- on each client machine, add all the other machines as source/binhosts to make.conf (and edit all machines whenever you add one)

- you're still wasting disk space by having copies eventually exist on almost all clients, sometimes twice: in the distfiles and cache dirs

FEATURES=replication could solve all this (one centralized server; any client uploads binary pkgs via http. No more client cache needed, or a server on every machine, or crazily-bloating client lists pointing to each other, and it would also eliminate the need for thinking about repcacheman). Or is there already a way to do a distributed-binhost type situation? I THINK this thing is great, and portage should be aware of it, in other words  :Smile: 

----------

## carpman

Hello, what script would you recommend for cleaning out duplicate files in the httprep dir?

I have seen a few scripts that either are no longer maintained, or that read installed package files to decide what to delete, which of course is no good for httprep as it holds apps downloaded for many machines.

cheers

----------

## flybynite

 *carpman wrote:*   

> 
> 
> have seen a few scripts that either are no longer maintained or read installed package files to decide what to delete

 

This is a tough problem for every gentoo user, not just http-replicator users.

Do you want to remove:

1. packages not in portage (might still be installed as a dependency)

2. packages not installed?

3. packages not accessed in a certain period of time?

4. superseded packages (see #1)

5. etc., etc.

My advice is pick one that you like based on your prefs and disk space.

Most scripts can be easily modified to clean the cache dir instead of the distfile dir.

If you don't want to modify the scripts, just move the cached files to the distfiles dir (mv /var/cache/http-replicator/* /usr/portage/distfiles/ ), then run your script, and then repcacheman.  repcacheman will check the files and move them back to the cache.  mv'ing files within the same partition is just a rename, so it is fast.

Here are a couple places to look for scripts:

First the FAQ:

https://forums.gentoo.org/viewtopic-t-30547.html

Some Gentoo Experimental scripts:

http://gentooexperimental.org/script/repo/search?categories_mod%5B0%5D=is&category%5B0%5D=cleaners&srch=true

Tmpwatch is in portage and looks like it will work:

```

* app-admin/tmpwatch

     Available versions:  2.9.2.2 2.9.4.1

     Installed:           none

     Homepage:            http://download.fedora.redhat.com/pub/fedora/linux/core/development/SRPMS/

     Description:         Utility recursively searches through specified directories and removes files which have not been accessed in a specified period of time.

```

Of course there are numerous scripts that can delete based on last access time, like this one, which deletes files not accessed in 30 days:

```

find /var/cache/http-replicator -type f -atime +30 -exec rm -f "{}" \;

```

See man find if you want to delete files based on other criteria.
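To see the effect of such a find invocation without touching a real cache, here is a throwaway-directory sketch; it substitutes -mtime +1 and a backdated file for -atime +30, purely so the result shows up immediately (GNU touch assumed for the -d date syntax):

```shell
# Throwaway stand-in for the cache directory
demo=$(mktemp -d)
touch "$demo/fresh.tar.bz2"                  # modified just now: kept
touch -d '3 days ago' "$demo/stale.tar.bz2"  # backdated: pruned

# Delete regular files not modified within the last day
find "$demo" -type f -mtime +1 -delete
```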

Hope this helps....

----------

## carpman

I suppose what I am looking for is superseded files; the only problem here is if I am running a ~x86 system on the LAN, which I am for testing GCC4 and reiser4.

Will have a look at the links, cheers

----------

## dalek

Well, I cleaned mine out this way.  I have http-repl* set up here.  I made a back-up copy of /usr/portage/distfiles just to be safe.  I'm on dial-up and wouldn't want to lose too much.  I had already run repcacheman to get it up to date.  Then I did rm -rfv /usr/portage/distfiles to remove them all.  I then did an emerge -efv world to reload all the files it would need to rebuild all installed packages.  It took it a bit, but it did.  There was about 1.5GB that went missing.  

After I knew I had all the files that I would ever need, I deleted the ones I had copied for back-up.  That should work.  If the distfiles directory is empty, it will download only the ones it has to have.

Don't forget to put the proxy line in your make.conf.  Point it back at yourself and it works well.

I'm not going to tell you how many times I hit the tab key to type in those commands and locations above.    :Embarassed:   :Laughing:  It's a habit because I type pretty bad.    :Shocked:   :Crying or Very sad: 

That make sense?

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## carpman

 *dalek wrote:*   

> Well, I cleaned mine out this way.  I have http-repl* set up here.  I made a back-up copy of /usr/portage/distfiles just to be safe.  I'm on dial-up and wouldn't want to loose to much.  I already had run repcacheman to get it up to date.  Then I rm -rfv /usr/portage/distfiles to remove them all.  I then did a emerge -efv world to reload all the ones it would need to rebuild all packages installed.  It took it a bit but it did.  There was about 1.5GBs that went missing.  
> 
> After I knew I had all the files that I would ever need, I then deleted the ones I had copied for back-up.  That should work.  If the distfiles directory is empty, it will download only the ones it has to have then.
> 
> Don't forget to put the proxy line in your make.conf.  Point it back at yourself and it works well.
> ...

 

I take it you need to do emerge -efv world on every machine on the network?

This is the problem with many existing scripts: they assume that you only want to keep files in that machine's world/pkg list, but as I have http-replicator on a headless server, all the GUI apps would be deleted.

I could do it via versions, but again, if I am running different spec boxes (amd64, x86, ~x86) then I would need different versions of apps.

All a bit of a pain really. Looks like I will best be served by date of file, so if not used or changed in 6 months, delete; but this again would cause problems for apps not updated often.

What would be nice is a script to read the world file on all machines on the network and then clean out the http-replicator dir.
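A script like that could start from something as simple as this sketch (entirely hypothetical: the function name is made up, and it assumes you have already copied each box's /var/lib/portage/world file over to the server, e.g. with scp):

```

# Hypothetical sketch: merge world files collected from several boxes into
# one sorted list of package atoms.  Nothing here ships with http-replicator.

def merge_world_files(paths):
    """Return the sorted union of package atoms found in the given world files."""
    atoms = set()
    for path in paths:
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if line and not line.startswith("#"):
                    atoms.add(line)
    return sorted(atoms)

# Usage idea: feed the merged list to whatever cleanup logic decides which
# cached distfiles are still wanted.

```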

----------

## dalek

If you do emerge -efv world, from what I have read and what I have done, it downloads everything needed to reinstall everything on the system, not just the packages in your world file.  That includes the dependencies of packages in your world file.  Mine fetches 515 files for my install.

Maybe you have something unique, but I am doing my second install this way and I have yet to have to download anything, since I am installing what I currently have on this system.  I have 4 computers here, some run as servers, some as desktops; emerge -efv world works just fine for me.  After I cleaned everything out, it copied everything I needed back to /usr/portage/distfiles and is working fine.

Like I said, maybe you have something way unique, but it works fine for me.  If you have something that unique, back-ups may be a good idea unless you can redownload pretty easily.  If I had DSL or something I would have just deleted the files and let it redownload it all that way.  It would take me a week on dial-up though.

It was an idea.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## todw1fd

Had this problem some time ago.  I just did a fresh install and am now seeing the same thing again.  During startup, Http-Replicator shows as started with an OK, and there will be a pid listed in the pid file.  The replicator log shows the service as started, but clients cannot connect.  When you run "/etc/init.d/http-replicator status" the response shows the service as started.  When you run "/etc/init.d/http-replicator stop" you're informed there is no instance running:

 *Quote:*   

> gentoo-1 log # /etc/init.d/http-replicator status
> 
>  * Caching service dependencies ...                                                                                    [ ok ] * status:  started
> 
> gentoo-1 log # /etc/init.d/http-replicator stop
> ...

 

After either doing a start, or a zap and then a start, the service functions and shows as running with a ps.  Clients can then connect.  I tried starting from local.start also, with the same results.  Any ideas?

----------

## todw1fd

Seems to start and stop fine manually as long as I don't have it trying to start automatically first in the default runlevel or from local.start.  Not quite understanding this behavior.

----------

## flybynite

 *todw1fd wrote:*   

> Seems to start and stop fine manually as long as I don't have it trying to start automatically first in the default runlevel or from local.start

 

Are you running ~x86?

The only time I've seen this is when replicator is unable to open the port.  Maybe a problem with the startup scripts or the config file?

You can simulate the auto startup by

```

source /etc/conf.d/http-replicator

/usr/bin/http-replicator

```

Which may give an error such as:

http-replicator: error: port 8080 is not available

If it is a config file problem, you can also check what options are set by sourcing the config file and echoing the opts, which should look similar to this:

```

source /etc/conf.d/http-replicator

echo $GENERAL_OPTS

```

which should give:

```

--dir /var/cache/http-replicator --user portage

```

Then try :

```

echo $DAEMON_OPTS

```

Which should give:

```

--dir /var/cache/http-replicator --user portage --alias /usr/portage/packages/All:All --alias /usr/src/:src --log /var/log/http-replicator.log --debug --ip 192.168.*.* --ip 10.*.*.* --port 8080

```

----------

## todw1fd

Thanks Flyby.  I'll give it a shot when I get back to the system at home.

----------

## todw1fd

 *flybynite wrote:*   

>  *todw1fd wrote:*   Seems to start and stop fine manually as long as I don't have it trying to start automatically first in the default runlevel or from local.start 
> 
> Are you running ~x86?
> 
> No, running x86
> ...

 

Yup, except missing your second alias /usr/src/:src

Doesn't want to complain after I'm up and running.   Just doesn't want to cooperate using the startup scripts for some reason.

----------

## NightMonkey

Apologies if this has been dealt with earlier in this thread, but Search doesn't really work too well within long threads. Today, I ran repcacheman, and got this nice note:

```
# repcacheman

Replicator's cache directory: /opt/http-replicator/

Portage's DISTDIR: /usr/portage/distfiles/

Comparing directories....

Done!

New files in DISTDIR:

jdk-1_5_0_05-linux-i586.bin

jce_policy-1_5_0.zip

Checking authenticity and integrity of new files...

Searching for ebuilds's ....

Done!

Found 21866 ebuilds.

Extracting the checksums....

Traceback (most recent call last):

  File "/usr/bin/repcacheman.py", line 162, in ?

    digestpath = os.path.dirname(digestpath)+"/files/digest-"+pv

  File "/usr/lib/python2.4/posixpath.py", line 119, in dirname

    return split(p)[0]

  File "/usr/lib/python2.4/posixpath.py", line 77, in split

    i = p.rfind('/') + 1

AttributeError: 'NoneType' object has no attribute 'rfind'
```

Any pointers on where to find the source of this problem? Let me know if you need more information. http-replicator has worked solidly (and silently) for over a year for me, so its record isn't bad! Thanks in advance.  :Smile: 

----------

## dahoste

Ditto NightMonkey's error.  Saw it for the first time today (11/30) -- but only on one of my gentoo servers.  The other one got through 'emerge --update --deep world' and 'repcacheman' just fine.  Haven't chased the problem yet - hoping that someone familiar with http-replicator's guts has an easy solution.  Also, as with most things gentoo, this might go away in 24 hours.   :Smile: 

Note: the only files in my distfiles dir (when attempting to run 'repcacheman') are the following:

pciutils-2.2.0.tar.gz

nfs-utils-1.0.6.tar.gz

pci.ids-20051015.bz2

----------

## dahoste

Never mind.  Problem magically went away (as hoped).  No problem with repcacheman today.  Odd, but life's too short to worry about it.

----------

## carpman

Hello, ok I am having a persistent problem emerging 3 apps; this has been going on for about a week, with an emerge sync once a day.

The problem is that the source files cannot be found on the servers.  The apps concerned:

```

media-libs/jbigkit-1.4

media-libs/jasper-1.701.0

media-libs/netpbm-10.30-r1

```

If I bypass my http-replicator server they download and emerge fine.  Any ideas?

----------

## flybynite

 *NightMonkey wrote:*   

> Apologies if this has been dealt with earlier in this thread, but Search doesn't really work too well within long threads. Today, I ran repcacheman, and got this nice note:
> 
> 

 

This usually happens because a gentoo developer doesn't input all the correct info into an ebuild.  repcacheman doesn't deal well with this.

I've considered beefing up the error correction but put it off, because like you said, it goes away when the developer fixes their mistake and the problem is actually hard to fix.

Well, the error isn't hard to fix, but if you can't match all files to their md5sum in portage, how can you know what files are bad and what is caused by the portage error?  It may be better just to wait for the portage update than to try and guess???

I have to do some more research on the different type of portage errors to know for sure.....

----------

## flybynite

 *carpman wrote:*   

> Hello, ok I am having a persistent problem emerging 3 apps; this has been going on for about a week, with an emerge sync once a day.
> 
> If i by pass my httprep server they download and emerge fine, any ideas

 

Yes!  This is most likely caused by a rare but possible remote server error.  The file in the replicator cache is corrupted.  This shouldn't happen, but it does in a few rare circumstances that I have NEVER been able to reproduce!

repcacheman is the fix.  Move all files from the replicator cache into the distfiles dir and then run repcacheman.  All files are checksummed and then moved back to the cache.  Any files left behind are the corrupted versions and will be deleted by portage, or you can delete them yourself.

This post has a full explanation and the fix:

https://forums.gentoo.org/viewtopic-t-173226-postdays-0-postorder-asc-start-105.html

----------

## dalek

Is there any way to tell it not to delete my java tarball that I download from Sun?  It would be nice if it would just move it to the cache and then copy it back if it needs it, but it seems determined to delete the thing every time I run repcacheman.  Then I get an error and cannot use the && thing, since it errors out and won't continue with the next command.

Bug, feature, me nuts?????    :Laughing:   :Embarassed: 

It only does it on this one file though.

Later

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## flybynite

 *dalek wrote:*   

> Is there any way to tell it not to delete my java tarball that I download from Sun? 

 

I guess I'm not exactly sure what the problem is?

If I understand you, then it's portage that's the problem, not replicator or repcacheman.

```

gate1 ~ # mv /home/tom/j2sdk-1_4_2_10-linux-i586.bin  /usr/portage/distfiles/

gate1 ~ # repcacheman

Replicator's cache directory: /var/cache/http-replicator/

Portage's DISTDIR: /usr/portage/distfiles/

Comparing directories....

Done!

New files in DISTDIR:

j2sdk-1_4_2_10-linux-i586.bin

Checking authenticity and integrity of new files...

Searching for ebuilds's ....

Done!

Found 21857 ebuilds.

Extracting the checksums....

Done!

Verifying checksum's....

/usr/portage/distfiles/j2sdk-1_4_2_10-linux-i586.bin

MD5 OK

SUMMARY:

Found 0 duplicate file(s).

        Deleted 0 dupe(s).

Found 1 new file(s).

        Added 1 of those file(s) to the cache.

        Rejected 0 corrupt or incomplete file(s).

Done!

```

replicator can verify the md5sum and add the file to the cache.  That's its job.  So far so good....

Now the problem is portage will NEVER, EVER do its job and try to download the file again!!!   The file is ready and waiting in the cache, but the call never comes.....

Look up RESTRICT="fetch" in the gentoo docs for more info on this problem.

I've asked gentoo devs to change this or make an exception possible.  It hasn't happened yet....  Please feel free to file a bug also, the more the merrier....

So gentoo says you gotta download the file yourself, repcacheman doesn't stop you.  By moving the file into the cache, all boxes on your net can download the file from the cache.

You have to outsmart portage if you want to sidestep the gentoo restriction.  You can copy the ebuild to the overlay dir and comment out the "fetch" restriction and all will be fine.  If you find yourself having to reinstall the binary this is the best solution.
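For the record, the overlay trick looks roughly like this (a sketch, not gospel: dev-java/sun-jdk and the version are stand-ins for whatever fetch-restricted package you need, and it assumes PORTDIR_OVERLAY=/usr/local/portage is set in make.conf):

```

OVERLAY=/usr/local/portage                            # must match PORTDIR_OVERLAY

PKG=dev-java/sun-jdk                                  # hypothetical package

EBUILD=$PKG/sun-jdk-1.4.2.10.ebuild                   # hypothetical version

mkdir -p "$OVERLAY/$PKG"

cp "/usr/portage/$EBUILD" "$OVERLAY/$EBUILD"

# comment out the fetch restriction in the overlay copy only

sed -i 's/^RESTRICT=/#RESTRICT=/' "$OVERLAY/$EBUILD"

# regenerate digests for the overlay copy

ebuild "$OVERLAY/$EBUILD" digest

```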

----------

## dalek

I see what you mean now.  That is nasty of portage, ain't it?  I do keep a separate copy laying around though.  That way I know I don't have to spend 3 hours or so downloading it again.

In one sense, it is actually Sun's fault.  They want you to accept their stupid licence thing to get it.  If the others worked correctly, I would use them instead of Sun's java.  I have tried them but they were causing me to lock up on some sites.  So I am stuck with Sun's silly restrictions.  Since you got me straight on that, may as well move the blame on up the food chain.    :Rolling Eyes: 

Thanks

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## lannocc

I occasionally run into the following problem when doing an emerge with my http_proxy pointing at a server running http-replicator. Portage tries to download the file a few times and then gives up, reporting the following after each attempt:

```
>>> Deleting invalid distfile. (Improper 404 redirect from server.)
```

Previously my solution was to temporarily comment out the http_proxy variable in make.conf and run the emerge again on the client machine. Today however I was browsing through this thread and ran into the situation again so I decided to try something different.

On the server running http-replicator I moved the affected file (in this case linux-2.6.14.tar.bz2, from gentoo-sources) from /var/cache/http-replicator back into /usr/portage/distfiles. I then ran repcacheman and lo and behold, it tells me the file is corrupt:

```
Replicator's cache directory: /var/cache/http-replicator/

Portage's DISTDIR: /usr/portage/distfiles/

Comparing directories....

Done!

New files in DISTDIR:

linux-2.6.14.tar.bz2

Checking authenticity and integrity of new files...

Searching for ebuilds...

Done!

Found 22291 ebuilds.

Extracting the checksums....

Done!

Verifying checksum's....

/usr/portage/distfiles/linux-2.6.14.tar.bz2

CORRUPT or INCOMPLETE 

SUMMARY:

Found 0 duplicate file(s).

        Deleted 0 dupe(s).

Found 1 new file(s).

        Added 0 of those file(s) to the cache.

        Rejected 1 corrupt or incomplete file(s).

Done!
```

That's good news, because at least now I know why my emerge was failing (corrupted file). So my first question is this: how did the corrupted file end up in http-replicator's cache in the first place? From what I understand reading this thread, there are safeguards to prevent that, no?

Secondly (multi-part), is it http-replicator that is returning the 404 redirect? I know what a 404 is and what a redirect is, but what is a 404 redirect? Is there a different http error status that http-replicator could return in this instance that might make more sense?

----------

## flybynite

 *lannocc wrote:*   

>  So my first question is this: how did the corrupted file end up in http-replicator's cache in the first place?
> 
> 

 

See the answer about half a page up...  If anyone can reproduce this error, I'm still looking...

https://forums.gentoo.org/viewtopic-t-173226-start-392.html

 *lannocc wrote:*   

> 
> 
> Secondly (multi-part), is it http-replicator that is returning the 404 redirect? 

 

No

----------

## lannocc

 *Quote:*   

> If anyone can reproduce this error, I'm still looking... 

 

I wonder, how much work would it be to add the checksum code from repcacheman to http-replicator, so that when a file is requested it does an integrity check on it first? If the integrity check fails, it should remove the file and download it again as though it were never there. Would performing the integrity check significantly slow down local (cached) downloads?
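To illustrate the idea, the check itself is only a few lines of Python. This is a made-up sketch, not code from http-replicator or repcacheman, and it leaves out the hard part of extracting the expected checksum from portage:

```

# Hypothetical integrity check: compare a cached file's MD5 against the
# checksum portage expects before serving it.  Helper names are invented.

import hashlib

def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            md5.update(chunk)
    return md5.hexdigest()

def is_intact(path, expected_md5):
    """True if the cached file still matches the expected checksum."""
    return file_md5(path) == expected_md5

```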

----------

## flybynite

 *lannocc wrote:*   

> 
> 
> I wonder, how much work would it be to add the checksum code from repcacheman to http-replicator, so that when a file is requested it does an integrity check on it first?

 

The problem with this is that http-replicator would then be gentoo specific.  replicator started out as and still is a debian (.deb) cache and will work on many platforms.

Keeping the gentoo specific code in repcacheman makes sense.  To solve the problem you've had, I planned on adding code to repcacheman to check replicators cache, not just the distfile dir.

The same thing can be done just by running repcacheman like this:

```

mv /var/cache/http-replicator/* /usr/portage/distfiles/

repcacheman

```

This is simple, cheap if both dirs are on the same filesystem, and a whole lot more flexible.

For example, I and probably others use replicator to cache other files besides portage files.  I can't have repcacheman just start deleting files because they're not in portage.

But in typical linux fashion, there are more ways to do this,  maybe just some more detail in the howto would work too.....

----------

## fcw

 *lannocc wrote:*   

> 
> 
> Secondly (multi-part), is it http-replicator that is returning the 404 redirect? I know what a 404 is and what a redirect is, but what is a 404 redirect? Is there a different http error status that http-replicator could return in this instance that might make more sense?

 

I got this same error message from emerge after I made a local web mirror available; it happened whenever the file requested was not available locally.  

It turns out that the web server was configured to present a normal-looking friendly page with links to various places in place of a bare 404 error message.  This is fine for users, but apparently not for whatever portage uses (wget?), which can't make sense of the response.  

After I altered just the HTTP status code on the friendly page to 404, while leaving the body of the page the same for end-users, emerge stopped presenting the '404 redirect' message.
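For what it's worth, if the mirror in question runs Apache this is a one-line fix, since ErrorDocument keeps the real status code as long as it points at a local path (an external URL would instead turn the error into a redirect, which is exactly what trips up wget). The filename here is hypothetical:

```

# httpd.conf or .htaccess on the mirror:

ErrorDocument 404 /errors/friendly404.html

```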

----------

## carpman

Hello, ok I have asked this before but got no reply, so I will try again as I am still having the problem.

The issue is that some files will not download when using http-replicator; if I comment it out in make.conf then the files download fine.

I have created /etc/portage/mirrors

```

emerge (2 of 97) media-libs/netpbm-10.30-r1 to /

>>> Downloading http://gentoo.blueyonder.co.uk/distfiles/netpbm-10.30-manpages.tar.bz2

--11:34:44--  http://gentoo.blueyonder.co.uk/distfiles/netpbm-10.30-manpages.tar.bz2

           => `/usr/portage/distfiles/netpbm-10.30-manpages.tar.bz2'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: 233,254 (228K) [application/x-tar]

100%[===================================================================>] 233,254       --.--K/s

11:34:44 (2.62 MB/s) - `/usr/portage/distfiles/netpbm-10.30-manpages.tar.bz2' saved [233254/233254]

>>> Downloading http://gentoo.blueyonder.co.uk/distfiles/netpbm-10.30.tgz

--11:34:44--  http://gentoo.blueyonder.co.uk/distfiles/netpbm-10.30.tgz

           => `/usr/portage/distfiles/netpbm-10.30.tgz'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: unspecified

    [ <=>                                                                ] 0             --.--K/s

11:34:44 (0.00 B/s) - `/usr/portage/distfiles/netpbm-10.30.tgz' saved [0]

>>> Resuming download...

>>> Downloading http://gentoo.blueyonder.co.uk/distfiles/netpbm-10.30.tgz

--11:34:44--  http://gentoo.blueyonder.co.uk/distfiles/netpbm-10.30.tgz

           => `/usr/portage/distfiles/netpbm-10.30.tgz'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: unspecified

    [ <=>                                                                ] 0             --.--K/s

11:34:44 (0.00 B/s) - `/usr/portage/distfiles/netpbm-10.30.tgz' saved [0]

>>> Resuming download...

>>> Downloading http://mirror.ovh.net/gentoo-distfiles/distfiles/netpbm-10.30.tgz

--11:34:44--  http://mirror.ovh.net/gentoo-distfiles/distfiles/netpbm-10.30.tgz

           => `/usr/portage/distfiles/netpbm-10.30.tgz'

Connecting to 192.168.1.3:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: unspecified

    [ <=>                                                                ] 0             --.--K/s

```

----------

## troubleticket

Hello everyone,

having had a short thread about the following item in "forums.gentoo.org" 

https://forums.gentoo.org/viewtopic-p-3001502.html#3001502 and also this posting on http://gentoo-wiki.com/index.php?title=Talk:HOWTO_Download_Cache_for_LAN-Http-Replicator I again try it here:

To me it seems that all software not installed on the installation server (http-replicator) itself will not be updated to the cache until the respective client does the emerge.

My aim is to update only the cache, e.g. once a night, after an

```
emerge sync
```

on the server (feeding the "Local Rsync Mirror") and afterwards have a consistent state for all the different clients, so that all subsequent updates from any client (except installation of a completely new package never installed before on any other machine) can run without an internet connection.

In other words: I don't think that the world file on the installation server/cache is being affected by the clients' software updates so an 

```
emerge -uD world
```

only on the cache-machine would not do what I want.

Or am I missing something ... ???

----------

## flybynite

 *carpman wrote:*   

> Hello, ok I have asked this before but got no reply, so I will try again as I am still having the problem.
> 
> 

 

Asked and answered.   I guess you missed it....

Here is your original post and the second reply down is the answer quoting your problem...

https://forums.gentoo.org/viewtopic-t-173226-postdays-0-postorder-asc-start-390.html

----------

## flybynite

 *troubleticket wrote:*   

> 
> 
> To me it seems that all software not installed on the installation server (http-replicator) itself will not be updated to the cache until the respective client does the emerge.
> 
> Or am I missing something ... ???

 

Sorry I don't follow the wiki.  This thread is the only support I give.

Having read all your posts, if I understand correctly, you want your nightly connection to the internet to download everything for all your different computers, ready for them to perform the updates without needing to connect to the net again?

If so, what you want can be done easily, just not how you think it can.

Each individual computer can only tell what it needs to update.  No other box can know that.  That is the way portage/gentoo works.

But here is how to do what you want.

```

emerge sync

emerge -f world

```

Run this from a cron on your local rsync server first, then 5 minutes later, run it on all other computers you have.

This will have all your computers ready to be updated without connecting to the net again.  The local rsync server and http-replicator will ensure that only 1 copy of ALL data (rsync and packages) will come from the internet, the rest will come from the cache.
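Concretely, the cron entries might look like this (the times are just an example; the 5-minute offset gives the server its head start):

```

# crontab on the local rsync / http-replicator server:

0 4 * * * emerge sync && emerge -f world

# crontab on every other computer, 5 minutes later:

5 4 * * * emerge sync && emerge -f world

```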

----------

## carpman

 *flybynite wrote:*   

>  *carpman wrote:*   Hello, ok I have asked this before but got no reply, so I will try again as I am still having the problem.
> 
>  
> 
> Asked and answered.   I guess you missed it....
> ...

 

Apologies that I missed it; thing is, I seem to get this sort of thing on a regular basis!

I did as suggested and moved all files in the http-replicator dir to /usr/portage/distfiles and ran repcacheman, which resulted in:

```

SUMMARY:

Found 0 duplicate file(s).

        Deleted 0 dupe(s).

Found 737 new file(s).

        Added 682 of those file(s) to the cache.

        Rejected 31 corrupt or incomplete file(s).

        24 Unknown file(s) that are not listed in portage

        You may want to delete them yourself....

```

Now the thing is, how do I know which ones are corrupt and which ones are not?

Will watch things and see what occurs.

----------

## flybynite

 *carpman wrote:*   

> 
> 
> Now the thing is, how do I know which ones are corrupt and which ones are not?
> 
> 

 

Unless you use replicator to cache other downloads, delete everything left in /usr/portage/distfiles/ after running repcacheman.  I don't have repcacheman do this by default because some users will have replicator cache files outside of portage; plus it takes root to delete those files, and repcacheman doesn't run as root by default.

If you look closer at the output of repcacheman, it told you which files are ok, corrupt, or not in portage!!

```

/usr/portage/distfiles/digikam-0.8.0.tar.bz2

MD5 OK

WARNING xine-lib-patches-21.tar.bz2 is not in portage!!!

/usr/portage/distfiles/libkexif-0.2.2.tar.bz2

MD5 OK

```

----------

## troubleticket

 *flybynite wrote:*   

> 
> 
> Having read all your posts, If I understand correctly, you want your nightly connect to the internet to have everything downloaded for all your different computers ready for them to perform the updates without needing to connect to the net again???
> 
> If so, what you want can be done easily, just not how you think it can.
> ...

 

First:

Thank you very much for your time, your effort, your work ... your answer.

Second:

Your suggestion assumes that I am "boss" of all of the boxes on my LAN, or if not, that all boxes are up and running all night long. Neither one meets my reality.

I had hoped there could be a solution like:

"Look along all my cache's content, extract the packages' "basenames" and feed some simple script with the result"

on the server ...

But it seems ... <sigh>

----------

## F.Ultra

First of all, many, many thanks to Flybynite for this excellent tool! 

As I understand it http-replicator does not handle FTP requests, and I wonder how much work it would be to add this functionality? If I am not too misinformed, an ftp_proxy means that wget would send an http request to the proxy with a ftp:// type of request embedded in the http request, so the communication between the client and server (http-replicator) would thus be the same; the "only" thing would be that http-replicator now had to fetch the file using ftp instead of http. Using wget in http-replicator for this task would perhaps work quite easily, with the only side-effect that the file would have to be downloaded in full before the data could be streamed to the clients. Or maybe I am dead wrong here  :Very Happy: 

Another thing is the mirrors, it would be very nice if one could simply insert a single mirror in make.conf on the clients with say GENTOO_MIRRORS="http://127.0.0.1", and let http-replicator choose a mirror from its own local make.conf when receiving such a request. In this way one would have a single point of mirror configuration (at the http-replicator server) which would be far easier than maintaining the mirror list on some 1000 clients  :Shocked: 

This was just my two thoughts on an otherwise brilliant piece of software!

----------

## flybynite

 *F.Ultra wrote:*   

> 
> 
> As I understand it http-replicator does not handle FTP requests, and I wonder how much work it would be to add this functionality?
> 
> Another thing is the mirrors, it would be very nice if one could simply insert a single mirror in make.conf on the clients with say GENTOO_MIRRORS="http://127.0.0.1", and let http-replicator choose a mirror from its own local make.conf

 

Correct on all counts. 

ftp has been in the works for a while now.  I've got sample code to do ftp requests.

What you may not know is that http-replicator started out as a general http and debian-specific cache.  I worked with the primary developer to get gentoo-specific features added, and http-replicator has taken off and become a supported package under gentoo.   I don't know if the project was ever accepted by debian.

Gertjan is the primary developer, http://gertjan.freezope.org/replicator/  as stated in the ebuild.  He is a debian user, not a gentoo user.

I worked with Gertjan to start these changes quite a while ago.  I gave him working ftp code and he had some good ideas on choosing mirrors.  He wouldn't add these features as a simple upgrade, but only as part of a complete rewrite.  I haven't heard anything lately, so I assume the project bogged down in the complete rewrite.  His changelog shows version 3.1 in the works as of May 2005. http://gertjan.freezope.org/replicator/changelog/

You may want to drop him a line and see if you can rekindle his interest in http-replicator.  It's probably hard for him to add gentoo specific features when he is a debian user....

----------

## Inhale

I'm using http-replicator 3. 

When emerging, the client machine's first wget to a local mirror always returns a file-not-found 404; the subsequent wget to a second mirror is successful. This is regardless of the file being requested. 

I confirmed that the first mirror and file exist, using wget and the URL pasted from the output on the http-replicator proxy. The same occurs for any cache miss.

```

# emerge gentoo-vdr-scripts

Calculating dependencies ...done!

>>> emerge (1 of 2) app-admin/sudo-1.6.8_p9-r2 to /

>>> Resuming download...

>>> Downloading ftp://ftp.citylink.co.nz/gentoo/distfiles/sudo-1.6.8p9.tar.gz

--11:48:56--  ftp://ftp.citylink.co.nz/gentoo/distfiles/sudo-1.6.8p9.tar.gz

           => `/home/portage/distfiles/sudo-1.6.8p9.tar.gz'

Resolving luke... 172.17.28.33

Connecting to luke|172.17.28.33|:8080... connected.

Proxy request sent, awaiting response... 404 Not Found

11:48:56 ERROR 404: Not Found.

>>> Resuming download...

>>> Downloading http://gentoo.osuosl.org/distfiles/sudo-1.6.8p9.tar.gz

--11:48:56--  http://gentoo.osuosl.org/distfiles/sudo-1.6.8p9.tar.gz

           => `/home/portage/distfiles/sudo-1.6.8p9.tar.gz'

Resolving luke... 172.17.28.33

Connecting to luke|172.17.28.33|:8080... connected.

Proxy request sent, awaiting response... 206 Partial Content

Length: 585,509 (572K), 568,853 (556K) remaining [application/x-gzip]

-

 2% [+                                    ] 16,656        --.--K/s

```

>>> Update: Doh! My first mirror is an FTP site, which the current http-replicator doesn't cache - changed the URI to the HTTP protocol and it works   :Embarassed: 

----------

## F.Ultra

flybynite: Many thanks for your info. I will have a look into this, but as you say, getting a Debian developer to do Gentoo-specific stuff is probably a dead end. It is a real pity that I do not know python/bash well enough to make these changes myself, but perhaps I will get my thumbs out and do a Gentoo-specific fork in C. Probably not, since I will probably focus more on doing a fully centralized management system instead of having each machine be managed separately.

----------

## Inhale

Non-issue - if you see the last line (the one starting with Doh), 'the technical department have located the source of the problem (points to self) and are endeavouring to correct the issue'  :Smile: 

- I remember reading somewhere that the next version (a major re-write) will also cache FTP URIs so this would have gone away on upgrade anyway..

----------

## Haakon

First, let me not be thankless and thank you for your efforts in creating and maintaining http-replicator as well as putting up with people like me and their problems.  :Very Happy: 

I did a quick scan of the recent posts, but I haven't seen anything like I am experiencing with http-replicator.  I admit, I did a fast scan so I could have missed it.

Here is a --debug log which shows the error I am encountering:

```

15 Apr 2006 04:00:21 INFO: HttpReplicator started

15 Apr 2006 04:00:51 STAT: HttpClient 1 bound to 192.168.x.4

15 Apr 2006 04:00:51 ERROR: HttpClient 1 caught an exception, closing socket

Traceback (most recent call last):

  File "/usr/lib/python2.4/asyncore.py", line 69, in read

    obj.handle_read_event()

  File "/usr/lib/python2.4/asyncore.py", line 391, in handle_read_event

    self.handle_read()

  File "/usr/bin/http-replicator", line 156, in handle_read

    self.data.write(chunk) # append received data

  File "/usr/lib/python2.4/asyncore.py", line 366, in __getattr__

    return getattr(self.socket, attr)

AttributeError: '_socketobject' object has no attribute 'data'

```

And this is what the client sees during the fetch part of the emerge:

```

>>> emerge (2 of 9) sys-libs/timezone-data-2006a to /

>>> Downloading http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

--05:21:40--  http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

           => `/usr/portage/distfiles/tzdata2006a.tar.gz'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... No data received.

Retrying.

--05:21:41--  http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

  (try: 2) => `/usr/portage/distfiles/tzdata2006a.tar.gz'

Connecting to 192.168.x.1:8080... failed: Connection refused.

>>> Downloading http://gentoo.osuosl.org/distfiles/tzdata2006a.tar.gz

--05:21:41--  http://gentoo.osuosl.org/distfiles/tzdata2006a.tar.gz

           => `/usr/portage/distfiles/tzdata2006a.tar.gz'

Connecting to 192.168.x.1:8080... failed: Connection refused.

>>> Downloading http://www.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/tzdata2006a.tar.gz

--05:21:41--  http://www.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/tzdata2006a.tar.gz

           => `/usr/portage/distfiles/tzdata2006a.tar.gz'

Connecting to 192.168.x.1:8080... failed: Connection refused.

>>> Downloading ftp://elsie.nci.nih.gov/pub/tzdata2006a.tar.gz

--05:21:41--  ftp://elsie.nci.nih.gov/pub/tzdata2006a.tar.gz

           => `/usr/portage/distfiles/tzdata2006a.tar.gz'

Resolving elsie.nci.nih.gov... 137.187.215.78

Connecting to elsie.nci.nih.gov|137.187.215.78|:21... connected.

Logging in as anonymous ...

Exiting on signal 2

```

The "signal 2" at the end is where I interrupted the emerge process so I could troubleshoot the error.  The last mirror you see there, the ftp site, is the only one that seems to download anything when I use http-replicator.  I have tried different port numbers above 10000, and I have also tried dropping my firewall to ensure that was not causing the problem.

Any help you can provide would be greatly appreciated.  I have over 20 computers at home, all of which have Gentoo.  Currently, I tend to copy the dist directory from one machine to the next to keep from having to download all the source files each time I update a machine.  This little utility you have written seems to be a godsend.  :Smile: 

----------

## flybynite

 *Haakon wrote:*   

> 
> 
> ```
> 
> 15 Apr 2006 04:00:21 INFO: HttpReplicator started
> ...

 

Did you censor the log, or do you really have 192.168.X.4 in the config file?

----------

## Haakon

 *flybynite wrote:*   

> 
> 
> Did you censor the log, or do you really have 192.168.X.4 in the config file?

 

Sorry about that, I should have said as much in the first place.  I censored the third octet because I didn't think it was needed for the problem at hand.

----------

## flybynite

Ok, just gotta check the obvious first. I don't mean anything by it.  I also have to ask: are you running ~x86?

It seems python is having trouble opening a socket.  Try re-emerging python.

Re-emerge python anyway, but have you updated the kernel/kernel headers, glibc or gcc lately?  Of course you're running gentoo, so the answer is probably yes to more than one  :Smile:     A list of versions for the above and python might be helpful also.  Last time I saw something similar, the re-emerge fixed it, but I expect some API change to hit anytime now....

```

emerge -va1 python

```

----------

## Haakon

 *flybynite wrote:*   

> Ok, just gotta check the obvious first. I don't mean anything by it.  I also have to ask: are you running ~x86?
> 
> It seems python is having trouble opening a socket.  Try re-emerging python.
> 
> Re-emerge python anyway, but have you updated the kernel/kernel headers, glibc or gcc lately?  Of course you're running gentoo, so the answer is probably yes to more than one     A list of versions for the above and python might be helpful also.  Last time I saw something similar, the re-emerge fixed it, but I expect some API change to hit anytime now....
> ...

 

Yup, I am running ~x86 Gentoo.  I didn't read anything into what you said.  :Smile:   The machine in question is a Sempron64 2600+ with an Asus K8N motherboard and 512MB of RAM.  I don't really think the video card has anything to do with this, but it is a Voodoo3.  The hard drive is an 80GB Maxtor on PATA.

glibc = version 2.3.5-r3

gcc = version 3.4.5-r1

gentoo-sources = 2.6.15-r1

linux-headers = 2.6.11-r2

Using profile 2006.0

I did a "emerge --sync" about two weeks ago.  After the sync, I did a "emerge --update --deep --newuse world"

I am running the 2.6.15-r1 kernel.  It was created using "genkernel --menuconfig all".  I ensured the proper initial RAM disk was married up with that kernel in my boot loader.  Here is that section from lilo.conf:

```

image=/boot/kernel-genkernel-x86-2.6.15-gentoo-r1

        label=default

        read-only

        root=/dev/ram0

        append="video=vesafb:1024x768-16@75 init=/linuxrc ramdisk=8192 real_root=/dev/hda1 udev"

        initrd=/boot/initramfs-genkernel-x86-2.6.15-gentoo-r1

```

I did see a suggestion about remerging the python package, and I did that, but I did it again just for S&G's and because you asked.  :Very Happy: 

Here is the first line before it asked to continue with the python emerge:

```

[ebuild   R   ] dev-lang/python-2.4.2  -X +berkdb -bootstrap -build -doc +gdbm -ipv6 +ncurses -nocxx +readline +ssl -tcltk -ucs2 0 kB

```

And here are the last few lines after the emerge finished.  You will also see the version there:

```

 * Byte compiling python modules for python-2.4 .. ...                                                                                                 [ ok ]

 *

 * If you have just upgraded from an older version of python you will need to run:

 *

 * /usr/sbin/python-updater

 *

 * This will automatically rebuild all the python dependent modules

 * to run with python-2.4.

 *

 * Your original Python is still installed and can be accessed via

 * /usr/bin/python2.x.

 *

>>> Regenerating /etc/ld.so.cache...

>>> dev-lang/python-2.4.2 merged.

>>> clean: No packages selected for removal.

>>> Auto-cleaning packages ...

>>> No outdated packages were found on your system.

 * GNU info directory index is up-to-date.

localhost ~ # python-updater

 * Can't determine any previous Python version(s).

localhost ~ #

```

I did give it a name other than "localhost".  Just another censor.

Oh, and I tried the python-updater....  :Smile: 

If you want, I can sync and update all the packages; however, I am making the assumption we don't want to potentially make more problems than we have already.  :Smile: 

Thanks again for all your help and efforts.

----------

## flybynite

 *Haakon wrote:*   

> 
> 
> glibc = version 2.3.5-r3
> 
> gcc = version 3.4.5-r1
> ...

 

Ok, you're running stable gentoo (not ~x86), and it just so happens I have the exact same versions of everything, right down to the kernel version!!  (What are the odds?)   So you haven't found any new API changes.

You don't need to rebuild your whole system, yet  :Smile:   Although it might be a good idea if you have the time, since you may have recently upgraded to gcc 3.4.5.  Is this a fresh install with gcc 3.4.5, or did you do the upgrade yourself?  If so, did you do the 'emerge -e' complete rebuild upgrade or the revdep-rebuild upgrade from this howto?

http://www.gentoo.org/doc/en/gcc-upgrading.xml#upgrade-3.3-to-3.4

I tried the revdep-rebuild upgrade and found a few packages later that revdep-rebuild upgrade missed so I did the complete rebuild later.

The next step is to check your config.  Try the tests in the following post:

https://forums.gentoo.org/viewtopic-t-173226-postdays-0-postorder-asc-start-384.html

----------

## Haakon

 *flybynite wrote:*   

> 
> 
> Ok, you're running stable gentoo (not ~x86), and it just so happens I have the exact same versions of everything, right down to the kernel version!!  (What are the odds?)   So you haven't found any new API changes.
> 
> You don't need to rebuild your whole system, yet   Although it might be a good idea if you have the time, since you may have recently upgraded to gcc 3.4.5.  Is this a fresh install with gcc 3.4.5, or did you do the upgrade yourself?  If so, did you do the 'emerge -e' complete rebuild upgrade or the revdep-rebuild upgrade from this howto?
> ...

 

Well, found one problem... it would seem my gcc profile is stuck on i386-pc-linux-gnu-3.3.6.  I changed that to i386-pc-linux-gnu-3.4.5 and I am following the steps of that howto.  I will be running an "emerge -eD world" next.  I will try http-replicator after all that and see if that works.

I had no idea I was supposed to do all that.  I just let emerge do its thing and come back hours later when it is done.  I apologize for my noobishness.   :Embarassed:    I also stick to the i386 side so that I can tarball the entire system up and clone it off to another system as needed.  I can always change make.conf later to customize for that particular machine.  Yes, I do have systems that old...  :Smile:   Hey, they work, so why not use them for simple tasks like a home file server?   :Very Happy: 

Oh, revdep-rebuild didn't like one missing link, but since I don't have X on my system, I didn't see the problem and neither did revdep.  :Smile: 

```

localhost X11 # revdep-rebuild

Configuring search environment for revdep-rebuild

Checking reverse dependencies...

Packages containing binaries and libraries broken by a package update

will be emerged.

Collecting system binaries and libraries... done.

  (/root/.revdep-rebuild.1_files)

Collecting complete LD_LIBRARY_PATH... done.

  (/root/.revdep-rebuild.2_ldpath)

Checking dynamic linking consistency...

  broken /usr/lib/X11/xkb/xkbcomp (requires  libX11.so.6 libxkbfile.so.1)

 done.

  (/root/.revdep-rebuild.3_rebuild)

Assigning files to ebuilds... done.

  (/root/.revdep-rebuild.4_ebuilds)

Evaluating package order... done.

  (/root/.revdep-rebuild.5_order)

Dynamic linking on your system is consistent... All done.

localhost X11 #

```

I am not sure how to permanently remove that link (/usr/lib/X11/xkb/xkbcomp), as I have removed the package via "emerge --unmerge", used equery to make sure it was all gone, and removed the packages that persisted.  The original tarball included X, which I didn't need for this machine, so I killed it and did the "emerge -eD world" to freshen the system after those changes and the changes to make.conf to exclude X.  However, I suspect this is a completely different issue and has nothing to do with why http-replicator isn't working.  I am only providing some insight into the system and how it was created.

I will post back when the tasks have finished and after I tried http-replicator again.

----------

## flybynite

 *Haakon wrote:*   

> 
> 
> Well, found one problem... it would seem my gcc profile is stuck on i386-pc-linux-gnu-3.3.6.  
> 
> 

 

Well, due to a blooper by a dev, the first ebuild of gcc 3.4.x may have caused some packages to build with the new 3.4.x compiler, which could have made those packages inconsistent.  The ebuild was quickly fixed.  If you just blindly upgraded to 3.4.x with the fixed ebuild, nothing would have happened.  Whether you were bitten by this blooper depends on when you upgraded.

Might as well finish the upgrade to be sure.

 *Haakon wrote:*   

> 
> 
> Oh, revdep-rebuild didn't like one missing link, but since I don't have X on my system, I didn't see the problem and neither did revdep. 
> 
> ```
> ...

 

Don't overthink this.  This orphan file can just be deleted, which will keep revdep-rebuild from complaining.

```

rm /usr/lib/X11/xkb/xkbcomp

```

----------

## Haakon

Well, the rebuild didn't work.  :Sad:   I have the exact same error as before.

As for the configuration file for http-replicator, it is as the default with the exceptions of the --debug option enabled and the class A private IP range being disabled.

Is it possible the client is actually the one with the problem?  I am pretty certain they all have the gcc profile problem.

Also, I don't have a problem rebuilding the server computer.  I have a backup server to provide the services this server provides.  Oh, I won't use my tarball install; I will do it the long way with a stage 3.  Want me to do the complete reinstall?

----------

## flybynite

 *Haakon wrote:*   

> 
> 
> As for the configuration file for http-replicator, it is as the default with the exceptions of the --debug option enabled and the class A private IP range being disabled.
> 
> 

 

First things first. Just one step at a time.

Did you run the tests in the post above?  This helps spot typos by actually printing what replicator would see, not what you see.

----------

## Haakon

Sorry about that.  My mind is racing...  Ok, so the first step from your second URL is these commands:

```

source /etc/conf.d/http-replicator

/usr/bin/http-replicator

```

which provided this output:

```

localhost conf.d # source /etc/conf.d/http-replicator

localhost conf.d # /usr/bin/http-replicator

INFO: HttpReplicator started

```

I pressed enter a few times here because I didn't get my command prompt back after executing the binary.

I pressed ctrl-c and got this and my command prompt back:

```

INFO: HttpReplicator terminated

```

The next two steps:

```

localhost conf.d # echo $GENERAL_OPTS

--dir /var/cache/http-replicator --user portage

localhost conf.d # echo $DAEMON_OPTS

--dir /var/cache/http-replicator --user portage --alias /usr/portage/packages/All:All --log /var/log/http-replicator.log --debug --ip 192.168.x.* --port 8080

localhost conf.d #

```

I noticed I didn't have the second --alias that you have, the "--alias /usr/src/:src" one.

Oh, and of course, the "--ip 10.*.*.*" is missing because I took it out, and the "--debug" option is in because I put that in.  :Smile: 

The localhost and the x in the IP's third octet are censored; however, the "x" represents the same number for the client and server.  I have an actual number where "x" sits, and the same goes for the config file below.

Finally, the /etc/conf.d/http-replicator file:

```

localhost conf.d # cat http-replicator

## Config file for http-replicator

## sourced by init scripts automatically

## GENERAL_OPTS used by repcacheman

## DAEMON_OPTS used by http-replicator

## Set the cache dir

GENERAL_OPTS="--dir /var/cache/http-replicator"

## Change UID/GID to user after opening the log and pid file.

## 'user' must have read/write access to cache dir:

GENERAL_OPTS="$GENERAL_OPTS --user portage"

## Don't change or comment this out:

DAEMON_OPTS="$GENERAL_OPTS"

## Do you need a proxy to reach the internet?

## This will forward requests to an external proxy server:

## Use one of the following, not both:

#DAEMON_OPTS="$DAEMON_OPTS --external somehost:1234"

#DAEMON_OPTS="$DAEMON_OPTS --external username:password@host:port"

## Local dir to serve clients.  Great for serving binary packages

## See PKGDIR and PORTAGE_BINHOST settings in 'man make.conf'

## --alias /path/to/serve:location will make /path/to/serve

## browsable at http://http-replicator.com:port/location

DAEMON_OPTS="$DAEMON_OPTS --alias /usr/portage/packages/All:All"

## Dir to hold the log file:

DAEMON_OPTS="$DAEMON_OPTS --log /var/log/http-replicator.log"

## Make the log messages less and less verbose.

## Up to four times to make it extremely quiet.

#DAEMON_OPTS="$DAEMON_OPTS --quiet"

#DAEMON_OPTS="$DAEMON_OPTS --quiet"

## Make the log messages extra verbose for debugging.

DAEMON_OPTS="$DAEMON_OPTS --debug"

## The ip addresses from which access is allowed. Can be used as many times

## as necessary. Access from localhost is allowed by default.

DAEMON_OPTS="$DAEMON_OPTS --ip 192.168.x.*"

#DAEMON_OPTS="$DAEMON_OPTS --ip 10.*.*.*"

## The proxy port on which the server listens for http requests:

DAEMON_OPTS="$DAEMON_OPTS --port 8080"

localhost conf.d #

```

I am pretty sure that is everything.  Let me know if I missed something.

----------

## flybynite

Hmm..  Everything looks good so far from the server side.

Let's start up the server like this.

```

source /etc/conf.d/http-replicator

/usr/bin/http-replicator

```

and then try to see where the problem kicks in.  Is the problem only on one file?  Or only on one http server?

You can test like this from another terminal without using portage.

```

http_proxy='yourproxy:8080' wget http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

```

which is the same file and server from your original problem.

Try it with a few different files and different servers.

```

http_proxy='yourproxy:8080' wget http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

```

If it worked so far, the files are in the cache.  Try the above again, which will retrieve the files from the cache.

Does it fail from a single box, or from another box as well?  Try it from the server box also.

I'd like to see at least one failure from the server side and the client side.  The other important question: can you narrow it down to a single client, file, or http server?
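If you want to cycle through several file/mirror combinations quickly, here is a dry-run sketch that only prints the wget invocations so you can paste and run them one at a time.  The proxy address is a placeholder; substitute your replicator box:

```shell
#!/bin/sh
# Dry-run sketch of the proxy tests above: prints each wget command
# instead of running it.  PROXY is a placeholder for the replicator box.
PROXY='192.168.x.1:8080'

print_proxy_tests() {
    for url in \
        http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz \
        http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2
    do
        # each line is a ready-to-paste wget test through the proxy
        echo "http_proxy='$PROXY' wget $url"
    done
}

print_proxy_tests
```

Add more URLs to the list to widen the test; a failure on every combination points at the proxy or client setup rather than any one mirror.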

----------

## Haakon

Okies.... I did the command:

```

source /etc/conf.d/http-replicator

/usr/bin/http-replicator

```

on the server.

On the client, a different client than the first, I typed this command as you asked:

```

http_proxy='192.168.x.1:8080' wget http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

```

The "x" is still the same variable as it has always been in my censors.  :Smile: 

Now comes the fun part.  This is what the client spat out:

```

localclient etc # http_proxy='192.168.x.1:8080' wget http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

--20:11:24--  http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

           => `tzdata2006a.tar.gz'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... No data received.

Retrying.

--20:11:25--  http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

  (try: 2) => `tzdata2006a.tar.gz'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... Read error (Connection reset by peer) in headers.

Retrying.

--20:11:27--  http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

  (try: 3) => `tzdata2006a.tar.gz'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... Read error (Connection reset by peer) in headers.

Retrying.

```

I broke the command after the 8th try, but I didn't think it was necessary to post all 8 retries of the same thing.  Three spams of the retries is enough.  :Razz: 

Now, at the same time, this is what the server was seeing:

```

localserver conf.d # source /etc/conf.d/http-replicator

localserver conf.d # /usr/bin/http-replicator

INFO: HttpReplicator started

WARNING: HttpReplicator blocked incoming request from 192.168.x.2:39424

WARNING: HttpReplicator blocked incoming request from 192.168.x.2:39425

WARNING: HttpReplicator blocked incoming request from 192.168.x.2:39426

WARNING: HttpReplicator blocked incoming request from 192.168.x.2:39427

WARNING: HttpReplicator blocked incoming request from 192.168.x.2:39428

WARNING: HttpReplicator blocked incoming request from 192.168.x.2:39429

WARNING: HttpReplicator blocked incoming request from 192.168.x.2:39430

WARNING: HttpReplicator blocked incoming request from 192.168.x.2:39431

INFO: HttpReplicator terminated

localserver conf.d #

```

"Localserver", "localclient", and "x" are censors again.  The x.2 IP is the IP of the client trying to make contact.  I can do the same thing with another client and get the same result; just the last octet changes.

I also tried your other download selection and tried this command:

```

http_proxy='192.168.x.1:8080' wget http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

```

This command produces the same output as the first file we tried to download.

At this point, I took it upon myself to have the server download the file from the first command manually, without http-replicator involved:

```

localserver conf.d # wget http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

--19:10:42--  http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

           => `tzdata2006a.tar.gz'

Resolving gentoo.chem.wisc.edu... 128.104.70.13

Connecting to gentoo.chem.wisc.edu|128.104.70.13|:80... connected.

HTTP request sent, awaiting response... 200 OK

Length: 149,612 (146K) [application/x-tar]

100%[========================================================================>] 149,612      121.08K/s

19:10:44 (120.85 KB/s) - `tzdata2006a.tar.gz' saved [149612/149612]

```

I then moved it to the http-replicator cache and gave it the same permissions as all the other files in the cache:

```

localserver conf.d # mv tzdata2006a.tar.gz /var/cache/http-replicator/

localserver conf.d # cd /var/cache/http-replicator/

localserver http-replicator # ls -l tzdata2006a.tar.gz

-rw-r--r--  1 root root 149612 Jan 31 23:01 tzdata2006a.tar.gz

localserver http-replicator # ls -l zip23.tar.gz

-rw-r--r--  1 portage portage 723283 Apr 13 21:18 zip23.tar.gz

localserver http-replicator # chown portage:portage tzdata2006a.tar.gz

localserver http-replicator # ls -l tzdata2006a.tar.gz

-rw-r--r--  1 portage portage 149612 Jan 31 23:01 tzdata2006a.tar.gz

```

I then tried the original command you suggested to download the file "tzdata2006a.tar.gz", but the errors are identical to what we see above for both the client and the server.

The computer that http-replicator is on has two ethernet ports; would that be a problem?  This computer acts as my gateway to the Internet, so ipchains and masquerading are in use.  I will also note that turning off ipchains didn't change the outcome.  I still got the same errors we see above.

As near as I can tell, it doesn't make a difference what server or file combination I use; I get the same error with every combination I attempted.

Last edited by Haakon on Tue Apr 25, 2006 2:57 pm; edited 1 time in total

----------

## flybynite

oops, so sorry.  My bad...

The error is obvious, but not the one you're having.  I meant to say to start replicator like this:

```

source /etc/conf.d/http-replicator

/usr/bin/http-replicator $DAEMON_OPTS

```

This will pass the options correctly and should eliminate the security warning.  Please retry the test...
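For the curious, the reason the unquoted $DAEMON_OPTS matters is ordinary shell word-splitting: left unquoted, the variable expands into separate arguments, which is what http-replicator expects on its command line.  A toy sketch with a made-up option string:

```shell
#!/bin/sh
# Sketch of shell word-splitting with a toy option string.
DAEMON_OPTS='--debug --port 8080'

# count_args reports how many arguments it was called with
count_args() { echo $#; }

count_args $DAEMON_OPTS     # unquoted: split into three arguments
count_args "$DAEMON_OPTS"   # quoted: passed as one giant argument
```

That is why running the bare binary earlier fell back to the defaults (no --ip allowed, hence the blocked requests): the options never reached it.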

----------

## Haakon

Hehe not a problem.  Do I win Simon Says? j/k   :Laughing: 

Oh, and I will be using the same censors as in my last post.  :Smile: 

Ok, I ran the revised commands as you said:

```

localserver ~ # source /etc/conf.d/http-replicator

localserver ~ # /usr/bin/http-replicator $DAEMON_OPTS

```

I then ran these commands to download three different files from three different mirrors on the client:

```

localclient ~ # http_proxy='192.168.x.1:8080' wget http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

localclient ~ # http_proxy='192.168.x.1:8080' wget http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

localclient ~ # http_proxy='192.168.x.1:8080' wget http://distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

```

Now, for the output spam...

Here is what the client saw:

```

localclient ~ # http_proxy='192.168.x.1:8080' wget http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

--10:03:05--  http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

           => `tzdata2006a.tar.gz.1'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: 149,612 (146K) [application/x-tar]

100%[=======================================================================>] 149,612      134.87K/s

10:03:07 (134.60 KB/s) - `tzdata2006a.tar.gz.1' saved [149612/149612]

localclient ~ # http_proxy='192.168.x.1:8080' wget http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

--10:03:21--  http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

           => `netcat-110-patches-1.0.tar.bz2.1'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: 25,751 (25K) [application/x-tar]

100%[=======================================================================>] 25,751        54.53K/s

10:03:22 (54.37 KB/s) - `netcat-110-patches-1.0.tar.bz2.1' saved [25751/25751]

localclient ~ # http_proxy='192.168.x.1:8080' wget http://distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

--10:03:27--  http://distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

           => `gcc-3.4.5-patches-1.4.tar.bz2.1'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: 52,228 (51K) [application/x-bzip2]

100%[=======================================================================>] 52,228        25.49K/s

10:03:30 (25.42 KB/s) - `gcc-3.4.5-patches-1.4.tar.bz2.1' saved [52228/52228]

localclient ~ #

```

and the server output from the three commands executed on the client:

```

localserver ~ # source /etc/conf.d/http-replicator

localserver ~ # /usr/bin/http-replicator $DAEMON_OPTS

INFO: HttpReplicator started

STAT: HttpClient 1 bound to 192.168.x.2

INFO: HttpClient 1 proxy request for http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

DEBUG: HttpClient 1 cache position: gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

DEBUG: HttpClient 1 connecting to gentoo.chem.wisc.edu

STAT: HttpServer 1 bound to gentoo.chem.wisc.edu

DEBUG: HttpServer 1 received header:

  GET /gentoo/distfiles/tzdata2006a.tar.gz HTTP/1.0

  host: gentoo.chem.wisc.edu

  connection: close

  accept: */*

  user-agent: Wget/1.10.2

INFO: HttpServer 1 serving file from remote host

DEBUG: HttpClient 1 received header:

  HTTP/1.1 200 OK

  content-length: 149612

  content-encoding: x-gzip

  accept-ranges: bytes

  server: Apache/2.0.54 (Gentoo/Linux)

  last-modified: Wed, 01 Feb 2006 08:01:12 GMT

  connection: close

  etag: "837a-2486c-155a4200"

  date: Sat, 22 Apr 2006 18:04:01 GMT

  content-type: application/x-tar

DEBUG: HttpServer 1 closed

STAT: HttpServer 1 sent 149612 bytes

INFO: HttpServer 1 cached gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

DEBUG: HttpClient 1 closed

STAT: HttpClient 1 received 149612 bytes

STAT: HttpClient 2 bound to 192.168.x.2

INFO: HttpClient 2 proxy request for http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

DEBUG: HttpClient 2 cache position: gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

DEBUG: HttpClient 2 connecting to gentoo.osuosl.org

STAT: HttpServer 2 bound to gentoo.osuosl.org

DEBUG: HttpServer 2 received header:

  GET /distfiles/netcat-110-patches-1.0.tar.bz2 HTTP/1.0

  host: gentoo.osuosl.org

  connection: close

  accept: */*

  user-agent: Wget/1.10.2

INFO: HttpServer 2 serving file from remote host

DEBUG: HttpClient 2 received header:

  HTTP/1.1 200 OK

  content-length: 25751

  accept-ranges: bytes

  server: Apache

  last-modified: Fri, 06 May 2005 23:03:43 GMT

  connection: close

  etag: "1170b561-6497-1a0251c0"

  date: Sat, 22 Apr 2006 18:04:10 GMT

  content-type: application/x-tar

DEBUG: HttpServer 2 closed

STAT: HttpServer 2 sent 25751 bytes

INFO: HttpServer 2 cached gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

DEBUG: HttpClient 2 closed

STAT: HttpClient 2 received 25751 bytes

STAT: HttpClient 3 bound to 192.168.x.2

INFO: HttpClient 3 proxy request for http://distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

DEBUG: HttpClient 3 cache position: distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

DEBUG: HttpClient 3 connecting to distro.ibiblio.org

STAT: HttpServer 3 bound to distro.ibiblio.org

DEBUG: HttpServer 3 received header:

  GET /pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2 HTTP/1.0

  host: distro.ibiblio.org

  connection: close

  accept: */*

  user-agent: Wget/1.10.2

INFO: HttpServer 3 serving file from remote host

DEBUG: HttpClient 3 received header:

  HTTP/1.1 200 OK

  content-length: 52228

  accept-ranges: bytes

  server: Apache/2.0.46 (Red Hat)

  last-modified: Tue, 07 Mar 2006 00:08:02 GMT

  connection: close

  etag: "1b32993-cc04-6fc7fc80"

  date: Sat, 22 Apr 2006 18:04:16 GMT

  content-type: application/x-bzip2

DEBUG: HttpServer 3 closed

STAT: HttpServer 3 sent 52228 bytes

INFO: HttpServer 3 cached distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

DEBUG: HttpClient 3 closed

STAT: HttpClient 3 received 52228 bytes

```

And all this again once those files were in cache:

Client:

```

localclient ~ # http_proxy='192.168.x.1:8080' wget http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

--10:10:03--  http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

           => `tzdata2006a.tar.gz.2'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: 149,612 (146K)

100%[=======================================================================>] 149,612       --.--K/s

10:10:03 (10.96 MB/s) - `tzdata2006a.tar.gz.2' saved [149612/149612]

localclient ~ # http_proxy='192.168.x.1:8080' wget http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

--10:10:09--  http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

           => `netcat-110-patches-1.0.tar.bz2.2'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: 25,751 (25K)

100%[=======================================================================>] 25,751        --.--K/s

10:10:09 (10.46 MB/s) - `netcat-110-patches-1.0.tar.bz2.2' saved [25751/25751]

localclient ~ # http_proxy='192.168.x.1:8080' wget http://distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

--10:10:21--  http://distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

           => `gcc-3.4.5-patches-1.4.tar.bz2.2'

Connecting to 192.168.x.1:8080... connected.

Proxy request sent, awaiting response... 200 OK

Length: 52,228 (51K)

100%[=======================================================================>] 52,228        --.--K/s

10:10:25 (10.76 MB/s) - `gcc-3.4.5-patches-1.4.tar.bz2.2' saved [52228/52228]

localclient ~ #

```

Server:

```

localserver ~ # source /etc/conf.d/http-replicator

localserver ~ # /usr/bin/http-replicator $DAEMON_OPTS

INFO: HttpReplicator started

STAT: HttpClient 1 bound to 192.168.x.2

INFO: HttpClient 1 proxy request for http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

DEBUG: HttpClient 1 cache position: gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

DEBUG: HttpClient 1 checking modification since Sat Apr 22 10:04:01 2006

DEBUG: HttpClient 1 connecting to gentoo.chem.wisc.edu

STAT: HttpServer 1 bound to gentoo.chem.wisc.edu

DEBUG: HttpServer 1 received header:

  GET /gentoo/distfiles/tzdata2006a.tar.gz HTTP/1.0

  host: gentoo.chem.wisc.edu

  if-modified-since: Sat, 22 Apr 2006 18:04:01 GMT

  connection: close

  accept: */*

  user-agent: Wget/1.10.2

INFO: HttpServer 1 serving file from cache

DEBUG: HttpClient 1 received header:

  HTTP/1.1 200 OK

  date: Sat, 22 Apr 2006 18:10:59 GMT

  connection: close

  etag: "837a-2486c-155a4200"

  content-length: 149612

  server: Apache/2.0.54 (Gentoo/Linux)

DEBUG: HttpServer 1 closed

DEBUG: HttpClient 1 closed

STAT: HttpClient 1 received 149612 bytes

STAT: HttpClient 2 bound to 192.168.x.2

INFO: HttpClient 2 proxy request for http://gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

DEBUG: HttpClient 2 cache position: gentoo.osuosl.org/distfiles/netcat-110-patches-1.0.tar.bz2

DEBUG: HttpClient 2 checking modification since Sat Apr 22 10:04:10 2006

DEBUG: HttpClient 2 connecting to gentoo.osuosl.org

STAT: HttpServer 2 bound to gentoo.osuosl.org

DEBUG: HttpServer 2 received header:

  GET /distfiles/netcat-110-patches-1.0.tar.bz2 HTTP/1.0

  host: gentoo.osuosl.org

  if-modified-since: Sat, 22 Apr 2006 18:04:10 GMT

  connection: close

  accept: */*

  user-agent: Wget/1.10.2

INFO: HttpServer 2 serving file from cache

DEBUG: HttpClient 2 received header:

  HTTP/1.1 200 OK

  date: Sat, 22 Apr 2006 18:10:57 GMT

  connection: close

  etag: "2a8bb01-6497-1a0251c0"

  content-length: 25751

  server: Apache

DEBUG: HttpServer 2 closed

DEBUG: HttpClient 2 closed

STAT: HttpClient 2 received 25751 bytes

STAT: HttpClient 3 bound to 192.168.x.2

INFO: HttpClient 3 proxy request for http://distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

DEBUG: HttpClient 3 cache position: distro.ibiblio.org/pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2

DEBUG: HttpClient 3 checking modification since Sat Apr 22 10:04:16 2006

DEBUG: HttpClient 3 connecting to distro.ibiblio.org

STAT: HttpServer 3 bound to distro.ibiblio.org

DEBUG: HttpServer 3 received header:

  GET /pub/linux/distributions/gentoo/distfiles/gcc-3.4.5-patches-1.4.tar.bz2 HTTP/1.0

  host: distro.ibiblio.org

  if-modified-since: Sat, 22 Apr 2006 18:04:16 GMT

  connection: close

  accept: */*

  user-agent: Wget/1.10.2

INFO: HttpServer 3 serving file from cache

DEBUG: HttpClient 3 received header:

  HTTP/1.1 200 OK

  date: Sat, 22 Apr 2006 18:11:13 GMT

  connection: close

  etag: "1b32993-cc04-6fc7fc80"

  content-length: 52228

  server: Apache/2.0.46 (Red Hat)

DEBUG: HttpServer 3 closed

DEBUG: HttpClient 3 closed

STAT: HttpClient 3 received 52228 bytes

```

If I am putting too much information in the posts, let me know.  I flinch every time I drop debug output on a post.  :Smile: 

Seems to work like a champ when running this way.  I checked the location of the downloaded files and they are in /var/cache/http-replicator, in directories named after the mirror from which they were downloaded.

----------

## flybynite

Ok, everything worked normally.

That means replicator is working correctly.  This test eliminated everything but your client setup, because portage uses wget to download files just like the test you did.

So wget can download everything ok, but portage calling wget creates an error.  This really narrows it down.  Probably the client config.

Find all the proxies and configs that affect the client first.

Here is a start, but this may not cover all the possible settings.

```

source /etc/make.conf

echo $FETCHCOMMAND

echo $RESUMECOMMAND

echo $PROXY

echo $ftp_proxy

echo $http_proxy

```

All but the last should be blank.

Are there any other proxies in use or set anywhere now or in the past?
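If it helps, the individual checks above can be rolled into one loop.  This is just a sketch that demos against a throwaway file; on a real box you would source /etc/make.conf instead:

```

# Sketch: after sourcing the config, print any proxy-related variable that is
# set.  Everything except http_proxy should come back empty.
# The demo file below stands in for /etc/make.conf.
unset FETCHCOMMAND RESUMECOMMAND PROXY ftp_proxy http_proxy  # start clean for the demo
cat > /tmp/make.conf.demo <<'EOF'
http_proxy="http://192.168.0.1:8080"
EOF
. /tmp/make.conf.demo
for v in FETCHCOMMAND RESUMECOMMAND PROXY ftp_proxy http_proxy; do
  eval "val=\$$v"
  [ -n "$val" ] && echo "$v is set: $val"
done
rm -f /tmp/make.conf.demo

```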

----------

## Haakon

Alrighty, I really apologize if this turns out to be something I totally over looked...

Here is the output you requested:

```

localserver etc # source /etc/make.conf

localserver etc # echo $FETCHCOMMAND

localserver etc # echo $RESUMECOMMAND

localserver etc # echo $PROXY

localserver etc # echo $ftp_proxy

localserver etc # echo $http_proxy

http://192.168.x.1:8080

localserver etc #

```

"x" is a censor.  The number it represents is the same on all my computers right now.

The computer that the http-replicator is installed on is the only thing close to a proxy that I have, and all the other computers go through it to get to the Internet.  However, there is no proxy service installed.  The gateway computer has a stateful firewall and that is what all the clients run through to get to the Internet.  There hasn't ever been a proxy on my network.

Now, I did, at first, set "127.0.0.1" as the http_proxy variable within the /etc/make.conf file.  I have, however, switched it to the 192.168.x.1 address that we have been seeing.  That change didn't produce any different results.

When you say "client" in your last post, are you referring to the computer which has http-replicator installed?  I assumed you were.

----------

## flybynite

The box that runs replicator I call the server.  Your other boxes I call the clients.  Client/server isn't always clear, especially in replicator's case, but I think of it that way.

I've assumed the errors you've had were from a box other than the one that runs replicator.  I also assumed that the successful wget test was from a client box that emerge will fail from.

It really doesn't matter much since the server box running replicator has the same setup as the clients.  The only important question is whether both the server and client boxes pass the wget test and fail the emerge -f test in the same way.

We need to check your /etc/make.conf for an error.  A single space or possibly an unmatched " or ' could cause errors that are hard to spot in these posts.  Check the make.conf on both the server and the clients.  I assume the clients are the same, so we can work with just one make.conf from one client and get that box working first.

Try renaming your make.conf to make.conf.good and creating a new make.conf with only the few lines that don't start with a #, and see if we can narrow it down that way.

Posting the make.conf can help but it is hard to spot spaces and odd characters in posts.
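One more trick, since make.conf is really just shell: bash can parse it without executing anything, and an unmatched quote shows up immediately as a syntax error.  A sketch, demonstrated on a deliberately broken throwaway file; on the real system you would run bash -n /etc/make.conf:

```

# make.conf is sourced as shell, so a stray quote is a parse error.
# The demo file below has an unterminated quote on the http_proxy line.
cat > /tmp/make.conf.broken <<'EOF'
CFLAGS="-O2 -pipe"
http_proxy="http://192.168.0.1:8080
EOF
if bash -n /tmp/make.conf.broken 2>/dev/null; then
  echo "syntax OK"
else
  echo "syntax error found"
fi
rm -f /tmp/make.conf.broken

```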

----------

## Haakon

 *flybynite wrote:*   

> 
> 
> The box that runs replicator I call the server.  Your other boxes I call the clients.  Client/server isn't always clear, especially in replicator's case, but I think of it that way.
> 
> 

 

That is the way I have been seeing it as well, except for the last post.

Here is where I got confused...  I have activated http-replicator like this on the server:

```

source /etc/conf.d/http-replicator

/usr/bin/http-replicator $DAEMON_OPTS

```

and ran the command "emerge -uDf" on the client and it worked perfectly.  The files began to fetch, no errors.  Only when I tried bringing up http-replicator as a service did I have the problem, so I figured it was more the server's problem than the client's, because the client would be making an anonymous http request as if it were a web browser, right?

Right after I typed that, I tried it out using a regular browser (Konqueror in this case) on the client and was not able to download the file "tzdata2006a.tar.gz" from http-replicator at the server.  I did remember to put "192.168.x.1:8080" as the http proxy for Konqueror.  When looking at the /var/log/http-replicator log on the server, I see it is the same error with Konqueror as it is with wget.  The moment I tried to use wget as the same regular user that used the browser, it would fail with the same familiar error.  Here is the wget command I used:

```

http_proxy='192.168.x.1:8080' wget http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

```

Before I ran all those experiments I recreated the /etc/make.conf file as you suggested first.

Here it is with the usual censors:

```

localclient ~ $ cat /etc/make.conf

USE="X gtk gtk2 3dnow dvd a52 aac avi cdr encode mp3 mpeg jpeg mozilla java javascript ogg oggvorbis oss samba svga usb gnome alsa gif kde opengl quicktime sse sse2 mmx svga acpi"

CHOST="i386-pc-linux-gnu"

CFLAGS="-mcpu=i586 -O3 -pipe"

CXXFLAGS="${CFLAGS}"

GENTOO_MIRRORS="http://gentoo.chem.wisc.edu/gentoo http://gentoo.osuosl.org http://www.ibiblio.org/pub/Linux/distributions/gentoo"

SYNC="rsync://192.168.x.1/gentoo-portage"

MAKEOPTS="-j2"

AUTOCLEAN="yes"

CCACHE_SIZE="1024M"

http_proxy="192.168.x.1:8080"

localclient ~ $

```

 *flybynite wrote:*   

> 
> 
> I've assumed the errors you've had were from a box other than the one that runs replicator.  I also assumed that the successful wget test was from a client box that emerge will fail from.
> 
> 

 

You are right here, too.  It has all been the clients that are unable to use the http-replicator, though I haven't tried using "emerge -f" with the server through http-replicator.

Again, right after I said this, I tried the server and got the same error as the clients.  But you figured this to be the case already.  :Smile: 

----------

## Kobboi

I have recently installed a server that functions as a local portage tree mirror (through rsyncd) and as a package file cache (through http-replicator). Now it would be cool if the server was aware of the packages installed on the clients in the network (their world files), so that, after the server syncs its portage tree, it could prefetch the package files that will be needed by one or more clients in case they wanted to "emerge -avuD world". Is there an elegant way to do this?

----------

## flybynite

 *Haakon wrote:*   

> 
> 
> Here is where I got confused...  I have activated http-replicator like this on the server:
> 
> ```
> ...

 

That sounds interesting.  So replicator works when started on the command line, it doesn't work when started by /etc/init.d/http-replicator?

I'd like to see /etc/init.d/http-replicator.  But again, spaces are hard to spot, so after posting, rename that file and emerge http-replicator again so it will reinstall a new one and see what happens.

----------

## flybynite

 *Kobboi wrote:*   

> Now it would be cool if the server was aware of the packages installed on the clients in the network (their world files)

 

Each client's USE settings, world file, and more affect what packages it needs.  There is no way to know what USE settings etc. are on other boxes, so there isn't an easy way to do this.  Portage just doesn't work like that.

However, you're probably remotely running "emerge sync" on each client with a script, so you could just add "emerge -uDvf world" to that script and you get the same result.

So instead of trying to get replicator to magically know what every other client needs, just have all clients fetch what they need when you sync them.  You can control when the downloads happen if that is your goal.  Replicator will still ensure that only one copy of each file is downloaded from the net no matter how many clients request that file, even if they all request the file at the same time or in sequence.
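For what it's worth, such a script might look like the sketch below.  The hostnames and the ssh invocation are made-up examples (not from this thread), and it only echoes the commands as a dry run:

```

#!/bin/sh
# Hypothetical fetch-ahead helper: after the server syncs, tell every client
# to sync and pre-fetch its own world update through the replicator proxy.
# HOSTS and the ssh command are example values.
HOSTS="client1 client2 client3"
for host in $HOSTS; do
  # dry run: print what would be executed on each client
  echo "ssh root@$host 'emerge --sync && emerge -uDvf world'"
done

```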

----------

## Haakon

 *flybynite wrote:*   

> 
> 
> That sounds interesting.  So replicator works when started on the command line, it doesn't work when started by /etc/init.d/http-replicator?
> 
> 

 

I did a bit more work and found that it is really when http-replicator is run as anything other than root.  I had changed the --user option in /etc/conf.d/http-replicator to "root" from "portage".  I executed /etc/init.d/http-replicator and ran a wget from the client, and everything worked correctly.  I tried changing the --user option to a regular user and I get the happy error we are accustomed to seeing.  I modified the /var/cache/http-replicator permissions to allow anyone full control to make sure it wasn't a permission problem there.  Once I found that wasn't it, I put the permissions back to normal.  I tried running http-replicator from the command line as a regular user instead of root, and again, the same error.

 *flybynite wrote:*   

> 
> 
> I'd like to see /etc/init.d/http-replicator.  But again, spaces are hard to spot, so after posting, rename that file and emerge http-replicator again so it will reinstall a new one and see what happens.
> 
> 

 

I did the above action before I started to mess around with http-replicator.  Nevertheless, here is the /etc/init.d/http-replicator file after the remerge  :Smile: 

```

localserver ~ # cat /etc/init.d/http-replicator

#!/sbin/runscript

# Copyright 1999-2004 Gentoo Technologies, Inc.

# Distributed under the terms of the GNU General Public License v2

# $Header: /var/cvsroot/gentoo-x86/net-proxy/http-replicator/files/http-replicator-3.0.init,v 1.1 2005/06/02 06:33:24 griffon26 Exp $

depend() {

        need net

}

start() {

        ebegin "Starting Http-Replicator"

        start-stop-daemon --start --pidfile /var/run/http-replicator.pid --name http-replicator \

                --startas /usr/bin/http-replicator -- -s -f --pid /var/run/http-replicator.pid --daemon $DAEMON_OPTS

        eend $? "Failed to start Http-Replicator"

}

stop() {

        ebegin "Stopping Http-Replicator"

        start-stop-daemon --stop --pidfile /var/run/http-replicator.pid --name http-replicator \

                --signal 2 --oknodo

        eend $? "Failed to stop Http-Replicator"

}

localserver ~ #

```

----------

## Kobboi

 *flybynite wrote:*   

> Each client's USE settings, world file, and more affect what packages it needs.  There is no way to know what USE settings etc. are on other boxes, so there isn't an easy way to do this.  Portage just doesn't work like that.
> 
> 

 

I understand. I guess I was a little too eager in speeding things up. Thanks for your reply.

----------

## flybynite

 *Haakon wrote:*   

> 
> 
> I did a bit more work and found that it is really when http-replicator is run as anything other than root.

 

Just to make sure I understand correctly.  You started replicator in a terminal logged in as root, with the --user option as both root and portage?  It works as --user root but not portage?

What do you get from this?

```

groups portage

ls -l /var/cache/

ls -l /var/log/http-replicator.log

ls -l /var/run/http-replicator.pid

```

Are you running any sort of security options on this server that would limit user abilities or access, like PaX, SELinux, grsecurity, LIDS, etc.?

----------

## Haakon

 *flybynite wrote:*   

> 
> 
> That sounds interesting.  So replicator works when started on the command line, it doesn't work when started by /etc/init.d/http-replicator?
> 
> I'd like to see /etc/init.d/http-replicator.  But again, spaces are hard to spot, so after posting, rename that file and emerge http-replicator again so it will reinstall a new one and see what happens.
> ...

 

Ok, I am really starting to think there may be something wrong with the python install.  I think the problem may date back to a while ago, when the original image of the tarball I use had a python update and I didn't do anything like execute "python-updater".  I only learned about that function after you had me reinstall python.  I got to see the warning message that would have otherwise been pushed off the buffer on a full world update.

Granted, it could be any number of dev packages that may not have been updated correctly.

Also, I ran http-replicator on a laptop I have at work that had Gentoo installed normally and http-replicator worked like a champ the first time.

So, I am starting to lean towards grabbing everything in my /etc directory on the server and doing a reload.  However, if you want to explore the issue some more, I have no problems working this down until we find the exact problem.

----------

## Haakon

 *flybynite wrote:*   

> 
> 
> Just to make sure I understand correctly.  You started replicator in a terminal logged in as root, with the --user option as both root and portage?  It works as --user root but not portage?
> 
> 

 

Yes, I started the replicator from a terminal as root by executing this command:

```

/etc/init.d/http-replicator start

```

No, I changed the --user option in the /etc/conf.d/http-replicator file from portage to root, so root would be the only thing after --user, besides the required syntax.  Here is that line from the /etc/conf.d/http-replicator file:

```

GENERAL_OPTS="$GENERAL_OPTS --user root"

```

 *flybynite wrote:*   

> 
> 
> What do you get from this?
> 
> ```
> ...

 

Sure thing:

```

localserver ~ # groups portage

portage

localserver ~ # ls -l /var/cache

total 48

-rw-------   1 root    root      136 Sep  4  2005 dhcpcd-eth0.cache

-rw-------   1 root    root      136 Jul 16  2005 dhcpcd-eth1.cache

drwxrwxr-x   3 root    portage  4096 Apr 23 17:11 edb

drwxrwxr-x   3 portage portage 20480 Apr 23 18:26 http-replicator

drwxrwxr-x  19 root    man      4096 Mar 10  2005 man

drwxr-xr-x   4 root    root     4096 Apr 24 13:50 samba

drwxr-xr-x   5 root    root     4096 Jul 17  2005 setup-tool-backends

drwxr-xr-x   2 squid   squid    4096 Jan 14 20:32 squid

localserver ~ # ls -l /var/log/http-replicator.log

-rw-r--r--  1 root root 9087 Apr 23 18:27 /var/log/http-replicator.log

localserver ~ # ls -l /var/run/http-replicator.pid

-rw-r--r--  1 root root 5 Apr 23 18:27 /var/run/http-replicator.pid

localserver ~ #

```

 *flybynite wrote:*   

> 
> 
> Are you running any sort of security options on this server that would limit user abilities or access like pax, selinux, gresecurity, lids etc.?
> 
> 

 

I didn't install any of these.

----------

## flybynite

 *Haakon wrote:*   

> 
> 
> Ok, I am really starting to think there may be something wrong with the python install.

 

Me too.  But I'm also curious so maybe one or two more things.

First make sure replicator is stopped by running:

```

/etc/init.d/http-replicator stop

killall http-replicator

```

Then open a terminal as root and run replicator from there to try to figure out exactly which option gives problems.  Here is how replicator is started by /etc/init.d/http-replicator:

```

/usr/bin/http-replicator -s -f --pid /var/run/http-replicator.pid --daemon --dir /var/cache/http-replicator --user portage --alias /usr/portage/packages/All:All --log /var/log/http-replicator.log --debug --ip 192.168.x.* --port 8080 

```

Everything after --daemon is from your $DAEMON_OPTS from an earlier post.

Put the real number back in place of the censored x in the ip, start replicator this way, and confirm it doesn't work for your clients.  Check either with emerge -f or with the wget method from earlier, your choice.

You can confirm which user replicator is running as with this, prior to testing with the clients, since it might die after the test:

```

ps aux | grep replicator

```

Started like this, replicator will background itself and you have to kill it with:

```

killall http-replicator

```

Then only change --user portage to --user root and try again.  This should work for the clients?  Then kill replicator with the killall command again.

Next try removing the --daemon and --user opts like this:

```

/usr/bin/http-replicator -s -f --pid /var/run/http-replicator.pid  --dir /var/cache/http-replicator  --alias /usr/portage/packages/All:All --log /var/log/http-replicator.log --debug --ip 192.168.x.* --port 8080

```

Run without the --daemon option, it won't background and will run as root.  This should also work for the clients?  Control-C will kill it this time.
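A side note on confirming the user: assuming the --pid file from the command above was actually written, you can also ask ps for just the owning user of that pid.  The demo below uses the shell's own pid so it runs anywhere:

```

# On the server: ps -o user= -p "$(cat /var/run/http-replicator.pid)"
# Demo with the current shell's own pid:
ps -o user= -p $$

```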

----------

## Haakon

Ok, ran these commands on the server:

```

localserver ~ # /etc/init.d/http-replicator stop

 * ERROR:  "http-replicator" has not yet been started.

localserver ~ # killall http-replicator

http-replicator: no process killed

localserver ~ # /usr/bin/http-replicator -s -f --pid /var/run/http-replicator.pid --daemon --dir /var/cache/http-replicator --user portage --alias /usr/portage/packages/All:All --log /var/log/http-replicator.log --debug --ip 192.168.x.* --port 8080

```

Ran this on the client:

```

localclient ~ # http_proxy='192.168.x.1:8080' wget http://gentoo.chem.wisc.edu/gentoo/distfiles/tzdata2006a.tar.gz

```

And got the same error as before.

Next, I ran these commands on the server:

```

localserver ~ # killall http-replicator

http-replicator: no process killed

localserver ~ # /usr/bin/http-replicator -s -f --pid /var/run/http-replicator.pid --daemon --dir /var/cache/http-replicator --user root --alias /usr/portage/packages/All:All --log /var/log/http-replicator.log --debug --ip 192.168.x.* --port 8080

```

When I ran the same client command above on the client, http-replicator worked and I downloaded the file to the client.

Lastly, I ran this command on the server:

```

localserver ~ # killall http-replicator

localserver ~ # /usr/bin/http-replicator -s -f --pid /var/run/http-replicator.pid  --dir /var/cache/http-replicator  --alias /usr/portage/packages/All:All --log /var/log/http-replicator.log --debug --ip 192.168.x.* --port 8080

INFO: HttpReplicator started

```

Http-replicator again worked correctly when running the wget command on the client.

Oh, the last command will work for the clients.   :Razz: 

----------

## flybynite

That narrows it down to a small section of code, but the trouble is not in that section of code.  If it were, it would fail with any user, not just a particular user.

The trouble is "which user", not the code.  That particular system seems to have trouble with users other than root performing socket operations.

The only unknowns left are the networking setup and the kernel.  You said that box is a server.  Does it have two networks and two network cards?  You're running iptables.  That creates the possibility that some type of packet mangling could cause a problem.  But I'm pretty sure you tried the test after flushing all the rules?  I believe /etc/init.d/iptables stop does that nicely.  That would eliminate that possibility.

The other half of the equation is the kernel.  I'm not familiar with genkernel but I do know it is software so it will have the occasional bug.

The last resort might be to upgrade the kernel.  Make sure to use the same gcc version that compiled python.

You said replicator worked on your laptop.  Might try that kernel version if it is different?  If it isn't different, try another version anyway.

----------

## Haakon

Ok, I am recompiling the kernel.  I didn't do that after the gcc updating that was done.  I will work with the network settings and see what happens.  Yes, I already tried turning off the iptables service and still had the same problem.  I will try turning off the outside interface and changing the default route to the other gateway.

While I am working on this, I was curious about something.  Would the package dnsmasq cause a problem?  The same server that I want to run http-replicator on also runs dnsmasq, but it looks at another dns server to update its own cache.

Oh, the kernels between the server and laptop are the same version.  :Smile:   It is the latest which portage supplied me.

----------

## flybynite

 *Haakon wrote:*   

> Would the package dnsmasq cause a problem?

 

I've never used it but I did take a quick look at the manual.  I don't think it will give problems.  Replicator filters based on network addresses, not names.  The dhcp portion could cause trouble if your config gives out network addresses not in the allowed network you have set up in replicator's config.

You have --ip 192.168.x.* in replicator's config; if dnsmasq gives out a 192.168.Y.* address, replicator will block the request but should give a security warning, like the one we got when I didn't have you add $DAEMON_OPTS in the command-line run of the server.
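The --ip value is a shell-style wildcard.  The sketch below illustrates the matching idea with a plain case statement; treat it as an analogy, since replicator's own matcher may differ in detail:

```

# Illustration only: shell-style wildcard matching, as in the --ip patterns.
match() {
  case "$1" in
    192.168.1.*) echo "allowed: $1" ;;
    *)           echo "blocked: $1" ;;
  esac
}
match 192.168.1.42   # allowed: 192.168.1.42
match 192.168.2.42   # blocked: 192.168.2.42

```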

----------

## jacooper

I have a suggestion to try:

```
chmod 777 /tmp
```

I was having the exact same issues: http-replicator would work as root but not as anyone else.  This was a freshly installed system without iptables installed.

It turns out that the reason I was having issues is that /tmp had permissions of 0755.  I changed that to 0777 and everything worked for me.  Since only root had write permission to /tmp, only root could create sockets.

Jeff

----------

## flybynite

 *jacooper wrote:*   

> Since only root had write permission to /tmp, only root could create sockets.
> 
> Jeff

 

Dude, awesome!  That's why I like some troubleshooting.  Gets you to know your stuff!!  It's always the simple things...

I don't know if that's it until Haakon checks in, but it fits the problem like a glove, right down to the exact error!!!

Do you know how those permissions got there?  Was it that way in the initial install??

----------

## Haakon

OMG, that was exactly it.  It is supposed to have permission 777 with the sticky bit for "others":

```

drwxrwxrwt   5 root root  4096 Apr 29 04:20 tmp

```
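In case it saves someone else the same hunt, here is a sketch of the fix (777 plus the sticky bit is octal 1777).  It demos on a scratch directory so it is safe to run anywhere; on the affected server the target is /tmp:

```

# Restore the conventional /tmp mode: 777 plus the sticky bit = 1777.
# On the real system the command is simply: chmod 1777 /tmp
d=$(mktemp -d)
chmod 1777 "$d"
stat -c '%a' "$d"          # prints 1777
ls -ld "$d" | cut -c1-10   # prints drwxrwxrwt
rm -rf "$d"

```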

This is usually something I change after my tarball install and I obviously forgot about it.  The tarball install leaves the /tmp directory as 755, as jacooper suggested.  Sorry, you two, for the long run on this, especially you, Flybynite.  :Sad:   Ironically, I have already made myself a reminder document on what to change after the tarball install, but it was created after this server's installation.

Also, I have caught this problem before, too (which is how it got into my document  :Smile:  ), but it was on systems that have X installed, and X will not load if users cannot create or open a file in the /tmp directory.  Naturally, X was not installed on the server we've been working to fix.

edit:  Almost forgot to answer your questions, Flybynite.  Since it is a tarball installation, there isn't really an initial install, but the original computer's /tmp dir is correct.  The /tmp dir, like the /sys and /proc dirs, is one I do not put into the tarball because the files are not needed or shouldn't be added.  The /tmp dir I create manually before the chroot, and that is how it got messed up.  After the chroot, I would normally change the permissions on the /tmp dir, but for whatever reason, I forgot to do so.

Thank you, for the tip, jacooper.

Flybynite, thank you for your patience in all of this and I apologize that this was more my fault than anything else.

----------

## jacooper

You're welcome.  flybynite did most of the work in diagnosing the issue.  I just happened to have a flash of intuition after reading all of that...

I'm not sure how it got that way.  I'm setting up another system right now and I've just finished the stage3 tarball extraction and the portage tarball extraction, and tmp is still drwxrwxrwt.  I'll keep a watch on it and see if I can catch anything changing it.

----------

## Haakon

No, I didn't mean any of the stage tarballs.  Since I have around 20 computers at home, I made one build as i586 and tarballed that system while it was active.  Since it was active, certain directories shouldn't be placed in the tar file.  The /tmp, /sys, and /proc are the main directories I omit and manually create them.

Also, using the system wide tar file is how I do a backup on my computers.  This server probably got its /tmp directory messed up when I restored the build after installing a new hard drive.

----------

## golding

flybynite,

I installed http-replicator on the server about three months ago and it has worked as a cache perfectly.  The problem I am having is getting it to start on boot.  If I check rc-status after bootup, http-replicator is stopped; however, if I then start it via the command line:

```
/etc/init.d/http-replicator start
```

it stays up until I close down the server.

I'm having the same problem Haakon had (I think), except my /tmp permissions (0777 +t) are fine, and if I start it from the command line using (note --user portage)

```
/usr/bin/http-replicator -s -f --pid /var/run/http-replicator.pid --daemon --dir /var/cache/http-replicator --user portage --alias /usr/portage/packages/All:All --log /var/log/http-replicator.log --debug --ip 192.168.99.* --port 8080
```

it works fine.

I have done everything Haakon did, but still no joy on getting http-replicator to start properly on boot.

Any other ideas?

----------

## flybynite

 *golding wrote:*   

> if I then start it via the command line;
> 
> ```
> /etc/init.d/http-replicator start
> ```
> ...

 

This is the same way replicator is started by the init scripts.

 *golding wrote:*   

> if I start it from the command line using (note --user portage)
> 
> ```
> /usr/bin/http-replicator -s -f --pid /var/run/http-replicator.pid --daemon --dir /var/cache/http-replicator --user portage --alias /usr/portage/packages/All:All --log /var/log/http-replicator.log --debug --ip 192.168.99.* --port 8080
> ```
> ...

 

So it works for you with both of these methods, which means it's not the same problem.

Set the debug option and start it up so it will fail.  Are there messages in replicator's log or the system log?

Show me this:

```

rc-update show

```

----------

## golding

 *flybynite wrote:*   

> Are there messages in replicator's log or the system log?

 The boot screen shows it starting OK. No errors listed in the system log.

The following is from when I last rebooted the machine yesterday

```
06 May 2006 09:38:17 INFO: HttpReplicator terminated

06 May 2006 12:05:58 INFO: HttpReplicator started

06 May 2006 12:19:02 INFO: HttpReplicator started

07 May 2006 09:08:00 STAT: HttpClient 1 bound to 192.168.99.1
```

About a quarter hour after I rebooted I checked with rc-status, found it "Stopped", so ran it with:

 *Quote:*   

> one rob # /etc/init.d/http-replicator restart
> 
> * starting http-replicator ...             [OK]

As you can see, it shows when I shut down in the morning, rebooted at lunch and restarted it after my cuppa, but nothing to show it stopping in between the two starts.

 *flybynite wrote:*   

> Set the debug option and start it up so it will fail.  Are there messages in replicator's log or the system log?

OK, is that a switch, or do I need to re-emerge the package?

 *flybynite wrote:*   

> Show me this:
> 
> ```
> 
> rc-update show
> ...

 Right here

```
one rob # rc-update show

           alsasound |      default

            bootmisc | boot

             checkfs | boot

           checkroot | boot

               clock | boot

         consolefont | boot

               cupsd |      default

                dbus |      default

            gkrellmd |      default

                 gpm |      default

                hald |      default

            hostname | boot

             hotplug | boot

     http-replicator |      default

            iptables |      default

             keymaps | boot

          lm_sensors | boot

               local |      default nonetwork

          localmount | boot

             modules | boot

            net.eth0 |      default

              net.lo | boot

            netmount |      default

                 nfs |      default

          ntp-client |      default

                ntpd |      default

             numlock |      default

             portmap |      default

           rmnologin | boot

              rsyncd |      default

           syslog-ng |      default

             urandom | boot

                 xdm |      default

                 xfs |      default

              xinetd |      default

```

----------

## flybynite

 *golding wrote:*   

> About a quarter hour after I rebooted I checked with rc-status, found it "Stopped"
> 
> 

 

You would only get the status "stopped" if replicator was shut down with the init scripts.  If replicator died, you would get an error message when you tried to restart it with the 'restart' or 'start' command.

Here I check the status, start, kill replicator, then try to start or restart.  Notice the errors.

```

gate1 ~ # /etc/init.d/http-replicator status

 * status:  stopped

gate1 ~ # /etc/init.d/http-replicator start

 * Starting Http-Replicator ...                                                                                                                       [ ok ]

gate1 ~ # killall http-replicator

gate1 ~ # /etc/init.d/http-replicator status

 * status:  started

gate1 ~ # /etc/init.d/http-replicator start

 * WARNING:  "http-replicator" has already been started.

gate1 ~ # /etc/init.d/http-replicator restart

 * Stopping Http-Replicator ...

No http-replicator found running; none killed.                                                                                                        [ ok ]

 * Starting Http-Replicator ...                                                                                                                       [ ok ]

```

I suspect replicator is being stopped by the init scripts.  Maybe because you start and stop your network?

Replicator "needs net", and if you stop your net, replicator will stop also.

```

gate1 ~ # /etc/init.d/net.eth0 stop

 * Stopping Http-Replicator ...                                                                                                                       [ ok ]

 * Unmounting network filesystems ...                                                                                                                 [ ok ]

 * Stopping ntpd ...                                                                                                                                  [ ok ]

 * Stopping sshd ...                                                                                                                                  [ ok ]

 * Stopping eth0

 *   Bringing down eth0

 *     Stopping dhcpcd on eth0 ...                                                                                                                    [ ok ]

 *     Shutting down eth0 ...                                                                                                                         [ ok ]

```

Do you use the init scripts to bring your network up or down?

----------

## golding

 *flybynite wrote:*   

> You would only get the status "stopped" if replicator was shutdown with the init scripts.

 Agreed, but what script?

 *flybynite wrote:*   

> I suspect replicator is being stopped by the init scripts.  Maybe because you start and stop your network?

 The network comes up, then stays up.  Pinging the other machines keeps going from boot: before, during, and after whatever is going on with the replicator.

 *flybynite wrote:*   

> Do you use the init scripts to bring your network up or down?

 I start up the server; when it is up, I then start the clients (the clients need the server up first to NFS-mount 'home').  All of it is done via the init scripts.

I am not known for being any good at fault finding.  I am more of a "fudger" than an expert, so this morning I thought I would try something different.

I started up the server, then clients, as I usually do, then ran a script to ping the clients from the server.  I also ran a script to run "rc-status" every 30 seconds so I could see when (and if) http-replicator stopped.  From the clients I ran emerge to download a large set of files from the server.

I did this in case there was something closing the replicator down from non use, a sort of service automounter?  It seems I was wrong as it still stopped, even when it was up and running in use.

What happened is that it all worked for seventeen minutes, then 'rc-status' reported http-replicator "Stopped" and the clients showed connection denied for emerge at the same time.  During all this the ping and nfs maintained their connections, so I can safely say the network stayed up.

I do not know what caused http-replicator to stop at seventeen minutes, the log still only shows termination from last night and start for this morning.  When I restarted the replicator, the log only showed an additional 'start', still no entry for the termination at seventeen minutes.  There was also no error showing the replicator was already stopped when I restarted it.

I re-emerged http-replicator, net-tools, python, nfs, nfs-tools, ntp, portmap, ... etc, last night in case a bug in the install was the cause.

----------

## flybynite

Ok, hmm....

Debug is in /etc/conf.d/http-replicator; remove the # before this line:

```

## Make the log messages extra verbose for debugging.

DAEMON_OPTS="$DAEMON_OPTS --debug"

```

Restart replicator and try the test again.  You should see some more output.
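The uncommenting can also be done with sed. A sketch against a throwaway copy of the two relevant lines (run the same substitution on /etc/conf.d/http-replicator once you're happy with it):

```shell
# Demo on a throwaway file: uncomment the --debug line with sed.
conf=$(mktemp)
cat > "$conf" <<'EOF'
## Make the log messages extra verbose for debugging.
#DAEMON_OPTS="$DAEMON_OPTS --debug"
EOF

# Strip the leading '#' from the DAEMON_OPTS --debug line only.
sed -i 's|^#\(DAEMON_OPTS="\$DAEMON_OPTS --debug"\)|\1|' "$conf"

grep -- '--debug' "$conf"
```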

----------

## wa1gon

 :Crying or Very sad:  I just installed http-replicator and have started getting the message:

!!! Digest verification Failed:

!!!    /usr/portage/distfiles/bash-3.1.tar.gz

!!! Reason: Failed on MD5 verification

on just about any package I try to install.  Did I miss something?

help

thanks

Darryl -- de WA1GON

----------

## golding

 *flybynite wrote:*   

> Ok, hmm....
> 
> Debug is in /etc/conf.d/http-replicator, remove the # before this line:
> 
> ```
> ...

 

OK, I activated debug and restarted.

Now the behaviour is getting really weird.  It shows [started] but does not allow connections, then shows [stopped] from boot, then [started] for a while, then changes to [stopped]  ... you get the idea?  During all this it only writes normal output to the log file, no extra information  :Sad: 

Oh well, I think something is messed up with my whole Gentoo install, maybe a package doing something totally unrelated but using the net in some way, and it is manifesting itself through the replicator (weirder things do happen), so until I have re-installed the entire Gentoo from scratch I'll let this one lie.  Probably in a month or so; I don't feel that masochistic at the moment  :Confused: 

Until then I'll just have to restart the replicator when needed.  After all, it seems this is only happening on my systems, not any others, otherwise I reckon we would have had a couple of "me too" posts.

Thank you for your time and effort flybynite  :Smile: 

----------

## golding

Progress! ... sort of  :Confused: 

I have been keeping tabs on the replicator behaviour for the last two days and noticed something different.

If I start up and work the server without root having been logged in, it all works properly.  As soon as I log in to the server as root, either by su or in a console, the replicator stops.  Then I restart it from the command line as root, and thereafter it keeps working fine.

I hadn't noticed the correlation between this and the replicator before, as I was never working for long without logging into root for various reasons.  The seventeen-minute delay before was when I first su'd to do something.  The varying behaviour was for the same reason: if I su'd straight away, replicator stopped then; if I did other things first, then su'd, the replicator stopped then.  Even the scripts I ran to test it were run from a user account, so I didn't see it then either.

I've never used ssh, so I don't know if this would happen if I logged into the server via ssh as root.  I also haven't tried sudo [command], as I never really used it before (much easier to su, then run whatever you need to).

I cannot see anything in profile or bashrc that could cause a problem; they are stock standard except for an alias for ls --color for root.

Sooo .. any other ideas?

----------

## Conditional_Zenith

Are there any solutions which use apache to do this?  I already use apache anyway, and I would like to avoid having 2 different http servers running on my box.

----------

## flybynite

 *Conditional_Zenith wrote:*   

> Are there any solutions which use apache to do this?

 

Yes, I originally used apache!!

See my    HOWTO: tsp-cache - Streaming Distfile Cache for LAN Use  for tsp-cache, which I wrote back in 2003!  (I can't believe it's still there!)

It is tailor-made for someone already running apache, but apache has some serious limitations as a distfile server.  http-replicator is much more featured and powerful.

The download link in the howto is dead.  If you're still interested we can find a way to get it to you, it's only 11kb!

----------

## Scen

Hello to everybody!

I still have the "service start" problem with http-replicator: the boot process shows "Starting Http-Replicator    [OK]" but, when I check for the http-replicator process, it doesn't exist.

However, if I restart the service, it starts without problems, with the following message:

```

# /etc/init.d/http-replicator restart

 * Stopping Http-Replicator ...

No http-replicator found running; none killed.                            [ ok ]

 * Starting Http-Replicator ...                                           [ ok ]

#

```

This problem appeared with the latest stable version of the 2.6 kernel (>= 2.6.13, maybe), while with a 2.4 kernel (or an early 2.6) the service starts fine.

If I don't put the http-replicator service in the "default" runlevel and, after the system has booted, I run the command

```

/etc/init.d/http-replicator start

```

the service starts fine.

I have tried adding the "--debug" and "--intolerant" options in /etc/conf.d/http-replicator, but the log doesn't have any useful information, and the runscript seems to start the service without problems.

flybynite, I'm in your hands!   :Rolling Eyes: 

P.S. Sorry for my bad English!

----------

## golding

I shifted http-replicator to one of my machines running stable (x86) only, and it now works fine.  All the problems I had with it on my server have disappeared.

Maybe this has to do with using ~x86?

 edit (1) - changed sentence to read correctly

----------

## flybynite

 *golding wrote:*   

> 
> 
> Maybe this is to do with using ~x86?
> 
> 

 

Could be.  Could you run some small tests to narrow it down to which packages....  Replicator should only be affected by the python or kernel versions.  Are python or the kernel different on the two boxes?

Although I still think something is changing when you log in with ssh.  Maybe ssh is doing some port forwarding or other trickery for you?

----------

## flybynite

 *Scen wrote:*   

> 
> 
> I still have the "service start" problem with http-replicator, that is, the boot process shows "Starting Http-Replicator    [OK]" but, when I check for the http-replicator process, it doesn't exist.
> 
> 

 

The last person with this error had incorrect permissions on the temp dir.  Show me the output of

```

ls -l /

```

If that looks like

```

drwxrwxrwt  19 root root  1384 May 31 16:12 tmp

```

Then follow my posts with Haakon here:

https://forums.gentoo.org/viewtopic-t-173226-postdays-0-postorder-asc-start-416.html

Don't worry about his replies, just run the tests I asked him to run.  I should make this into a FAQ someday so I don't have to repeat it so often...
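For a quick check of the sticky-bit permissions, here is a sketch demonstrated on a scratch directory rather than the real /tmp (on a real box, check with `stat -c '%a' /tmp` and fix with `chmod 1777 /tmp`):

```shell
# /tmp should be mode 1777 -- the trailing 't' in drwxrwxrwt is the
# sticky bit. Demonstrated on a throwaway directory.
d=$(mktemp -d)
chmod 1777 "$d"           # the fix, if the sticky bit is missing
stat -c '%a %n' "$d"      # GNU stat: prints the octal mode and the path
```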

----------

## Havin_it

Hi all  :Smile: 

I've used http-replicator with no issues for a few months now, but suddenly I'm getting a problem with repcacheman.  When I run it, it parses the DISTDIR, finds all the ebuilds, but then hangs interminably (well, 10 minutes and counting) on the line "Extracting the checksums...."

Any idea what's doing this?  I've just done an emerge -uD world, but didn't take note of which packages were merged.  I don't think there's been an update to http-replicator for a while, though I re-emerged it just in case the install was b0rked, but that made no difference.

Nothing alarming in the logs either.  What else can I try?

EDIT: Never mind, it completed eventually, just took a really long time.  Still curious as to why though...

----------

## flybynite

 *Havin_it wrote:*   

> Hi all 
> 
> EDIT: Never mind, it completed eventually, just took a really long time.  Still curious as to why though...

 

You might be having a problem with your setup or your selected http mirrors.  This might cause ALL your downloads to come from ftp and slow repcacheman way down.  repcacheman only does the full-blown md5sum check on the rare file that isn't available from GENTOO_MIRRORS in your /etc/make.conf.  Make sure you have a full set of healthy http mirrors in there.
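A quick way to spot ftp-only entries is to grep the mirror list. Sketched here against a sample make.conf (point the grep at /etc/make.conf on a real box; the mirror URL is a made-up example):

```shell
# List any ftp:// entries in GENTOO_MIRRORS -- these bypass the fast
# http path. Demonstrated on a throwaway sample file.
mc=$(mktemp)
echo 'GENTOO_MIRRORS="ftp://ftp.example.org/gentoo/ http://gentoo.osuosl.org/"' > "$mc"
grep '^GENTOO_MIRRORS' "$mc" | grep -o 'ftp://[^" ]*'
```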

I should also add that portage is disk-limited during this check; if another disk-intensive process is running at the same time, both will slow way down.

However, to answer your question. 

repcacheman uses portage database functions to read every single ebuild in the whole portage tree.  This takes time, and it will take more time as the tree grows.  Also, portage isn't optimized for the database lookups repcacheman does.  Repcacheman could really use its own database to optimize the queries, but I haven't had any other complaints about the speed, since 99.9% of the files should be on an http mirror and repcacheman should only rarely have to do the full-blown md5sum check.

Portage is under heavy development now, and portage 2.1+ is much faster due to structural changes and optimizations.  Hopefully that will translate to a faster repcacheman as well.

I've noticed a slowdown in stable portage.  I believe either portage 2.0 is buckling under the load, or it is slowing down because of the changes being made behind the scenes for portage 2.1, which isn't far from being released as stable but is very different in the database department.

You may not need to run repcacheman as often if you are finding the run time troublesome.  Once replicator is installed and running healthily, repcacheman only serves to transfer the couple of files that may have been available only on ftp to replicator's cache.  If you don't run repcacheman, the worst that will happen is that individual boxes may rarely have to ftp one or two files themselves.  This might mean that only 99.9% of the files are downloaded once and cached, instead of 100%.  If you're running hundreds of clients and paying per MB downloaded that might be unacceptable; otherwise running repcacheman less often could be for you.

----------

## hpeters

I am having a problem where http-replicator never even receives the request from portage when emerging a file.

This only happens on the server ( the machine running http-replicator ). All the other computers on the lan work great.

Everything is set up correctly: the http_proxy variable is set, GENTOO_MIRRORS is set.  But when I emerge a file, portage connects directly to the portage mirror and never even attempts to connect to http-replicator.

I am using http-replicator 3.0 and the latest ~x86 version of portage. I have tried with the "stable" version of portage and it didn't work either.

Any ideas ?

----------

## hpeters

Ok, I found the problem: wgetrc, located at /etc/wget/wgetrc, had the configuration variable use_proxy=off set.

Commenting that out fixed the problem.  How it got set in the first place I don't know; I have never modified that config file in my life.
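In case anyone else hits this, a sketch of checking for (and commenting out) that setting, demonstrated on a sample file; on a real box run the same commands against /etc/wget/wgetrc:

```shell
# Spot a proxy-disabling line in wgetrc and neutralize it.
rc=$(mktemp)
printf 'use_proxy=off\n' > "$rc"

grep -n '^use_proxy' "$rc"                      # show the offending line
sed -i 's/^use_proxy=off/#use_proxy=off/' "$rc" # comment it out
```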

Harley

----------

## phsdv

For a while now, probably since unmasking portage 2.1, I have had some issues with repcacheman; see the code below.

```
# /usr/bin/repcacheman

Replicator's cache directory: /var/cache/http-replicator/

Portage's DISTDIR: /usr/portage/distfiles/

Comparing directories....

Done!

Deleting duplicate file(s) in /usr/portage/distfiles/

http-replicator_3.0.tar.gz

Done!

New files in DISTDIR:

pvr_2.0.24.23035.zip

... some more files ...

gcc-3.4.5-ssp-1.0.tar.bz2

Checking authenticity and integrity of new files...

Searching for ebuilds...

Done!

Found 23941 ebuilds.

Extracting the checksums....

Traceback (most recent call last):

  File "/usr/bin/repcacheman.py", line 162, in ?

    digestpath = os.path.dirname(digestpath)+"/files/digest-"+pv

  File "/usr/lib/python2.4/posixpath.py", line 119, in dirname

    return split(p)[0]

  File "/usr/lib/python2.4/posixpath.py", line 77, in split

    i = p.rfind('/') + 1

AttributeError: 'NoneType' object has no attribute 'rfind'

```

Is this a bug in repcacheman or in my system?  Does anyone know how to fix this?

[edit]I added some print statements and found why it stops:

```
pv kuroo-0.80.2

digestpath: /usr/portage/app-portage/kuroo/kuroo-0.80.2.ebuild

digestpath: /usr/portage/app-portage/kuroo/files/digest-kuroo-0.80.2

pv kuroo8-svn-0_beta1

digestpath: None

Traceback (most recent call last):

  File "/usr/bin/repcacheman.py", line 163, in ?

    digestpath = os.path.dirname(digestpath)+"/files/digest-"+pv

  File "/usr/lib/python2.4/posixpath.py", line 119, in dirname

    return split(p)[0]

  File "/usr/lib/python2.4/posixpath.py", line 77, in split

    i = p.rfind('/') + 1

AttributeError: 'NoneType' object has no attribute 'rfind'
```

I will remove this package, and I expect it to work again.[/edit]
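For what it's worth, the traceback happens because the portage API returned None as the ebuild path for that package, and repcacheman passes it straight to os.path.dirname. A sketch of a defensive guard (the function and argument names here are illustrative, not repcacheman's actual code):

```python
import os

def digest_path(ebuild_path, pv):
    """Build the files/digest-<pv> path for an ebuild, skipping packages
    whose ebuild path the portage API returned as None (e.g. a broken
    overlay entry), instead of crashing with an AttributeError."""
    if ebuild_path is None:
        return None  # caller should skip this package
    return os.path.dirname(ebuild_path) + "/files/digest-" + pv

# The healthy case from the debug output above:
print(digest_path("/usr/portage/app-portage/kuroo/kuroo-0.80.2.ebuild",
                  "kuroo-0.80.2"))
# The broken overlay entry no longer raises; it is just skipped:
print(digest_path(None, "kuroo8-svn-0_beta1"))
```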

----------

## flybynite

 *phsdv wrote:*   

> Since a while, probably since unmaksing portage2.1, I have some issues with repcacheman

 

This comes up from time to time.  repcacheman just uses the portage API to access the data so when portage is updated, repcacheman is updated also.  Of course someday the portage API might change, but since it still works for me I don't think that has happened yet  :Smile: 

The problem you're having usually means an illegal/missing/invalid entry is in the portage database.  A gentoo dev will fix this in a few hours or maybe a day.  repcacheman will probably work fine after your next sync.

Of course, you might have put the bad data in the database yourself.  Do you have any portage overlays?

If this error still happens after your next sync, temporarily remove any overlays you have to see if one of them is the cause.

[edit]  Seems you've found the offending ebuild.  If it is still broken after your next sync, file a bug with the maintainer of that ebuild....

----------

## phsdv

 *flybynite wrote:*   

> [edit]  Seems you've found the offending ebuild.  If it is still broken after your next sync, file a bug with the maintainer of that ebuild....

 :Embarassed:  The offending ebuild's maker was found: the ebuild was in a local overlay which I had made myself  :Embarassed: 

----------

## bodelicious

I can ping a server.

dhcpcd is running.

But when I want to download some packages through emerge, it hangs, saying "...connected. HTTP request sent, awaiting response... No data received."

and eventually

"Giving up."

I didn't find any answer to your old post, but I assume you got it fixed.

Could you help me solve this problem, please?

thx!

 *xkb wrote:*   

> I get the following error on the machine running the proxy:
> 
> ```
> 
> Resolving myhostname.net... someip
> ...

 

----------

## flybynite

First, check your config file on the server, since other clients work fine, most likely the server /etc/make.conf file has errors.

I need to make a troubleshooting FAQ soon, but for now try to follow this exchange between haakon and me.  You may not have the same problem, but the troubleshooting steps are the same.

Perform the steps I had haakon do to start the troubleshooting.

https://forums.gentoo.org/viewtopic-t-173226-postdays-0-postorder-asc-start-413.html

----------

## Kobboi

Forgive me for not reading the whole thread before asking my questions   :Embarassed: 

(If you really hold this against me, replies like "it's in the thread" are allowed)

- What happens if client 1 wants a (huge) file and client 2 requests it as well, before it has been completely downloaded by http-replicator?  Does client 2 get a partial (invalid) file?  Not all that bad in an emerge context, but I'd like to know.

- I often do an emerge -avuDf (= complete update, but fetch only), just to have the necessary files already in place for when I eventually do update.  One package (ivtv) keeps downloading two files over ftp, although they are in the cache.  What is the reason for this and how do I "solve" it?

----------

## flybynite

 *Kobboi wrote:*   

> 
> 
> - What happens if client 1 wants a (huge) file and client 2 as well, requesting it before it has been completely downloaded by http-replicator? client 2 gets a partial (invalid) file?
> 
> 

 

NEVER!

http-replicator is smart.  Both clients get exactly the correct file, and only one copy is ever downloaded from the net.  If a long download is in progress and another client requests the same file, the new client will receive the already-downloaded part quickly from the cache and will "catch up" with the first client's download still in progress.  Both clients will then get the file streamed simultaneously as it is downloaded from the net and saved to the cache!  This works whether 2 clients or 200+ clients make requests.
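The "catch up" behaviour can be modelled in a few lines of Python. This is a toy illustration of the idea only, not http-replicator's actual code: a writer thread streams chunks into a cache file while a late-joining reader first drains what is already on disk at full speed, then tails the file as new chunks arrive.

```python
# Toy model of a streaming cache: slow writer, late-joining reader.
import os
import tempfile
import threading
import time

CHUNKS = [b"chunk%d " % i for i in range(10)]
TOTAL = sum(len(c) for c in CHUNKS)

def writer(path):
    with open(path, "wb") as f:
        for c in CHUNKS:
            f.write(c)
            f.flush()
            time.sleep(0.01)       # simulate a slow internet download

def late_reader(path, got):
    time.sleep(0.03)               # join while the download is in progress
    with open(path, "rb") as f:
        while sum(len(g) for g in got) < TOTAL:
            data = f.read()
            if data:
                got.append(data)   # cached part arrives at disk speed
            else:
                time.sleep(0.005)  # caught up; wait for the writer

path = os.path.join(tempfile.mkdtemp(), "distfile")
open(path, "wb").close()
got = []
t1 = threading.Thread(target=writer, args=(path,))
t2 = threading.Thread(target=late_reader, args=(path, got))
t1.start(); t2.start(); t1.join(); t2.join()
assert b"".join(got) == b"".join(CHUNKS)   # the reader got the complete file
```

The reader never sees a truncated file: it simply consumes the cached prefix quickly and then proceeds at download speed, which is the essence of what the proxy does for each late-joining client.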

 *Kobboi wrote:*   

> 
> 
> - I often do an emerge -avuDf (=complete update, but fetch only), just to have the necessary files already in place for when eventually do update. One package (ivtv) keeps downloading two files over ftp, although they are in the cache. What is the reason for this and how do I "solve" this?

 

This is covered in the long(ish) HOWTO itself, although it's probably not obvious.

From the howto:

```

Also, some packages in portage have a RESTRICT="nomirror" option which will prevent portage from checking replicator for those packages. The following will override this behavior. Create the file "/etc/portage/mirrors" containing:

Code:

# Http-Replicator Override for FTP and RESTRICT="nomirror" packages

local http://gentoo.osuosl.org

You can replace gentoo.osuosl.org with your favorite HTTP:// mirror. If you already have a local setting, don't worry; as long as it is an http mirror this will still be effective.

```

It isn't obvious that this is the case with ivtv, but this will confirm it.

```

# grep mirror  /usr/portage/media-tv/ivtv/ivtv-0.7.0.ebuild

RESTRICT="nomirror"

```

man portage

```

/etc/portage/mirrors

                     Whenever  portage  encounters  a mirror:// style URL it will look up the actual

                     hosts here.  If the mirror set is not found here, it will check the global mir-

                     rors  file at /usr/portage/profiles/thirdpartymirrors.  You may also set a spe-

                     cial mirror type called "local".  This list of mirrors will be  checked  before

                     GENTOO_MIRRORS and will be used even if the package has RESTRICT="mirror".

```

restrict=nomirror is a portage function and I have no control over it or why it exists.  This workaround must be done on all clients.

At least there is a workaround for restrict=nomirror; there is also a restrict=nofetch which has no workaround  :Sad:  , but there are only a few such ebuilds.

----------

## tek0

Is it usual that repcacheman.py needs an eternity for extracting the digests?  On my box, it takes about 20 minutes at full CPU usage... a niceness param would be helpful  :Smile: 

----------

## dalek

```
nice -5 repcacheman
```

That should work.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## flybynite

 *tek0 wrote:*   

> Is it usual that repcacheman.py needs an eternity for extracting the digests? 

 

repcacheman uses portage database functions to read every single ebuild in the whole portage tree.  This takes time, and it will take more time as the tree grows.  Also, portage isn't optimized for the database lookups repcacheman does.  Repcacheman could really use its own database to optimize the queries, but I haven't had many other complaints about the speed, since 99.9% of the files should be on an http mirror and repcacheman should only rarely have to do the full-blown md5sum check.

You may not need to run repcacheman as often if you are finding the run time troublesome.  Once replicator is installed and running healthily, repcacheman only serves to transfer the couple of files that may have been available only on ftp to replicator's cache.  If you don't run repcacheman, the worst that will happen is that individual boxes may rarely have to ftp one or two files themselves.  This might mean that only 99.9% of the files are downloaded once and cached, instead of 100%.  If you're running hundreds of clients and paying per MB downloaded that might be unacceptable; otherwise running repcacheman less often could be for you.

Also make sure you delete the distfiles after running repcacheman.  A partially downloaded file left in the distfiles dir will cause the full-blown md5 check every time repcacheman is run.  I don't do this automatically because some people use replicator to cache files that are not in portage.

I do it like this

```

repcacheman && rm -r /usr/portage/distfiles

```

Unless you use replicator to cache other files, anything left in your distfiles dir after repcacheman runs is:

1. No longer in Portage

2. Incomplete or corrupt

3. Just plain junk

4.  Wasting your time and disk space

----------

## tek0

 *flybynite wrote:*   

>  *tek0 wrote:*   Is is usual that repcacheman.py needs an eternity for extracting the digests?  
> 
> repcacheman uses portage database functions to read every single ebuild in the whole portage tree. This takes time. This will take more time as the tree grows. Also portage isn't optimized to do the database lookups repcacheman does. Repcacheman could really use its own database to optimize the queries, but I haven't had many other complaints about the speed since 99.9% of the files should be on an http mirror and repcacheman should only rarely have to do the full blown md5sum check.
> 
> You may not need to run repcachman as often if you are finding the run time troublesome. Once replicator is installed and running healthy, repcacheman only serves to transfer a couple of files that may have been only available on ftp to replicators cache. If you don't run repcachman, the worst that will happen is that individual boxes may rarely have to ftp one or two files themselves. This might mean that only 99.9% of the files are downloaded once and cached, instead of 100%. If your running hundreds of clients and paying per MB downloaded that might be unacceptable, otherwise running repcacheman less often could be for you.
> ...

 

Oh, I see.  I thought repcacheman was essential for all files, so I ran it daily through cron, and there were about 10 corrupt or incomplete files in $DISTDIR that seem to have caused the trouble.  Guess I'll just nice it to 15 && rm $DISTDIR (I want to work meanwhile, dalek  :Wink:  ) and it will be fine.

----------

## ribx

 *flybynite wrote:*   

> 
> 
> Bandwidth limiting wasn't one being considered  Nobody ever asked for it before...  Http-Replicator is all Python and doesn't use any external programs such as wget.

 

I found this on page 4.  I would also like to see this feature  :Smile:  It would be much easier than doing it through the kernel or something.  It's just that my brother cannot play and my mother cannot surf while I'm downloading updates.

But even without that feature, http-replicator is still a really good thing - and so easy to set up.  Good job!

-ribx

----------

## RichieB

Greetings. 

 After installing http-replicator as per the howto, along with a sync mirror, I get the following error when attempting to emerge -uDag: 

 ====== 

 These are the packages that would be merged, in order: 

 Fetching binary packages info... 

 date: Fri, 06 Oct 2006 11:16:14 GMT 

 connection: close 

 <html><body><pre><h1>Not found</h1></pre></body></html> 

 address: /All 

 Traceback (most recent call last): 

 File "/usr/bin/emerge", line 4049, in ? 

 emerge_main() 

 File "/usr/bin/emerge", line 4044, in emerge_main 

 myopts, myaction, myfiles, spinner) 

 File "/usr/bin/emerge", line 3452, in action_build 

 myopts, myparams, spinner) 

 File "/usr/bin/emerge", line 675, in __init__ 

 trees["/"]["bintree"].populate( 

 File "/usr/lib/portage/pym/portage.py", line 5434, in populate 

 self.remotepkgs = getbinpkg.dir_get_metadata( 

 File "/usr/lib/portage/pym/getbinpkg.py", line 450, in dir_get_metadata 

 filelist = dir_get_list(baseurl, conn) 

 File "/usr/lib/portage/pym/getbinpkg.py", line 295, in dir_get_list 

 raise Exception, "Unable to get listing: %s %s" % (rc,msg) 

 Exception: Unable to get listing: 404 Server did not respond successfully (404: Not Found) 

 ====== 

 The server /etc/conf.d/http-replicator: 

 ====== 

 ## Config file for http-replicator 

 ## sourced by init scripts automatically 

 ## GENERAL_OPTS used by repcacheman 

 ## DAEMON_OPTS used by http-replicator 

 ## Set the cache dir 

 GENERAL_OPTS="--dir /var/cache/http-replicator" 

 ## Change UID/GID to user after opening the log and pid file. 

 ## 'user' must have read/write access to cache dir: 

 GENERAL_OPTS="$GENERAL_OPTS --user portage" 

 ## Don't change or comment this out: 

 DAEMON_OPTS="$GENERAL_OPTS" 

 ## Do you need a proxy to reach the internet? 

 ## This will forward requests to an external proxy server: 

 ## Use one of the following, not both: 

 #DAEMON_OPTS="$DAEMON_OPTS --external somehost:1234" 

 #DAEMON_OPTS="$DAEMON_OPTS --external username:password@host:port" 

 ## Local dir to serve clients. Great for serving binary packages 

 ## See PKDIR and PORTAGE_BINHOST settings in 'man make.conf' 

 ## --alias /path/to/serve:location will make /path/to/serve 

 ## browsable at http://http-replicator.com:port/location

 DAEMON_OPTS="$DAEMON_OPTS --alias /usr/portage/packages/All:All" 

 ## Dir to hold the log file: 

 DAEMON_OPTS="$DAEMON_OPTS --log /var/log/http-replicator.log" 

 ## Make the log messages less and less verbose. 

 ## Up to four times to make it extremely quiet. 

 #DAEMON_OPTS="$DAEMON_OPTS --quiet" 

 #DAEMON_OPTS="$DAEMON_OPTS --quiet" 

 ## Make the log messages extra verbose for debugging. 

 DAEMON_OPTS="$DAEMON_OPTS --debug" 

 ## The ip addresses from which access is allowed. Can be used as many times 

 ## as necessary. Access from localhost is allowed by default. 

 DAEMON_OPTS="$DAEMON_OPTS --ip 192.168.*.*" 

 DAEMON_OPTS="$DAEMON_OPTS --ip 10.*.*.*" 

 DAEMON_OPTS="$DAEMON_OPTS --ip 127.*.*.*" 

 ## The proxy port on which the server listens for http requests: 

 DAEMON_OPTS="$DAEMON_OPTS --port 8001" 

 ====== 

 The server /etc/rsyncd.conf: 

 ====== 

 # Copyright 1999-2004 Gentoo Foundation 

 # Distributed under the terms of the GNU General Public License v2 

 # $Header: /var/cvsroot/gentoo-x86/app-admin/gentoo-rsync-mirror/files/rsyncd.conf,v 1.6 2004/07/14 21:12:47 agriffis Exp $ 

 uid = nobody 

 gid = nobody 

 use chroot = yes 

 max connections = 20 

 pid file = /var/run/rsyncd.pid 

 motd file = /etc/rsync/rsyncd.motd 

 transfer logging = no 

 log format = %t %a %m %f %b 

 syslog facility = local3 

 timeout = 300 

 #[gentoo-x86-portage] 

 #this entry is for compatibility 

 #path = /usr/portage 

 #comment = Gentoo Linux Portage tree 

 [gentoo] 

 #modern versions of portage use this entry 

 path = /usr/portage 

 comment = Gentoo Linux Portage tree mirror 

 exclude = distfiles 

 ====== 

 And the server /etc/make.conf: 

 ====== 

 # These settings were set by the catalyst build script that automatically built this stage 

 # Please consult /etc/make.conf.example for a more detailed example 

 #CFLAGS="-O2 -march=i686 -pipe" 

 CFLAGS="-march=athlon-mp -O2 -fomit-frame-pointer -pipe" 

 CHOST="i686-pc-linux-gnu" 

 CXXFLAGS="${CFLAGS}" 

 MAKEOPTS="" 

 ACCEPT_KEYWORDS="" 

 USE=" -* -arts -gnome -gtk 3dnow 3dnowext 3ds X aac acpi alsa apm amr audiofile asf bitmap-fonts bzip2 cdparanoia cdr clamav crypt ctype cups dbus dvd dvdread effects encode exif expat fam fastbuild ffmpeg foomaticdb force-cgi-redirect gd gdbm gif glut gmp gpm gtk gtk2 hal idn imlib ipv6 jpeg jpeg2k kde kdgraphics lcms libg++ libvisual libwww lm_sensors mad memlimit mikmod mmx mmxext mng motif mozilla mp3 mpeg mplayer msn ncurses nls nptl nptlonly nsplugin ogg openal opengl oss pam pcre perl png posix qt3 quicktime readline reiserfs scanner sdl session simplexml slang sockets spamassassin spell sse ssl svg svga tcltk tcpd theora threads tiff truetype truetype-fonts type1-fonts udev usb v4l visualization vorbis win32codecs wmf x264 xine xml xml2 xsl xv xvid xvmc yahoo zlib java faad xcomposite dvd cdrom ivman pmount" 

 FEATURES="-sandbox ccache buildpkg" 

 LINGUAS="en_GB" 

 GENTOO_MIRRORS="ftp://ftp.mirrorservice.org/sites/www.ibiblio.org/gentoo/ http://gentoo.osuosl.org/"

 INPUT_DEVICES="keyboard mouse" 

 HTTP_PROXY="http://localhost:8001" 

 VIDEO_CARDS="nvidia" 

 PORTDIR_OVERLAY=/usr/local/portage 

 ====== 

 and the client /etc/make.conf: 

 ====== 

 # These settings were set by the catalyst build script that automatically built this stage 

 # Please consult /etc/make.conf.example for a more detailed example 

 #CFLAGS="-O2 -march=i686 -pipe" 

 CFLAGS="-march=athlon-mp -O2 -fomit-frame-pointer -pipe" 

 CHOST="i686-pc-linux-gnu" 

 CXXFLAGS="${CFLAGS}" 

 MAKEOPTS="" 

 ACCEPT_KEYWORDS="" 

 USE=" -* -arts -gnome -gtk 3dnow 3dnowext 3ds X aac acpi alsa apm amr audiofile asf bitmap-fonts bzip2 cdparanoia cdr clamav crypt ctype cups dbus dvd dvdread effects encode exif expat fam fastbuild ffmpeg foomaticdb force-cgi-redirect gd gdbm gif glut gmp gpm gtk gtk2 hal idn imlib ipv6 jpeg jpeg2k kde kdgraphics lcms libg++ libvisual libwww lm_sensors mad memlimit mikmod mmx mmxext mng motif mozilla mp3 mpeg mplayer msn ncurses nls nptl nptlonly nsplugin ogg openal opengl oss pam pcre perl png posix qt3 quicktime readline reiserfs scanner sdl session simplexml slang sockets spamassassin spell sse ssl svg svga tcltk tcpd theora threads tiff truetype truetype-fonts type1-fonts udev usb v4l visualization vorbis win32codecs wmf x264 xine xml xml2 xsl xv xvid xvmc yahoo zlib java faad xcomposite dvd cdrom ivman pmount" 

 FEATURES="-sandbox ccache" 

 LINGUAS="en_GB" 

 SYNC="rsync://localhost/gentoo" 

 HTTP_PROXY="http://localhost:8001" 

 PORTAGE_BINHOST="http://localhost:8001/All" 

 INPUT_DEVICES="keyboard mouse" 

 VIDEO_CARDS="nvidia" 

 PORTDIR_OVERLAY=/usr/local/portage 

 ====== 

 Please note that the http-replicator port has been set to 8001, as the client is running an HTTP proxy of its own for internet access. 

 There are 50 packages in the server's /usr/portage/packages/All directory. 

 One oddity you may notice is that the server is running within a chroot environment on the client machine. This is simply because this machine has a particular configuration that I'm building for and, at the moment, I only have one of them! There are no shared directories between the 'parent' environment (the 'client') and the chroot environment (the 'server'). 

 Also, when running an emerge -av world --deep on the server (within its chroot), all is fine - presumably the http-replicator proxy is doing its job? 

 Many thanks for any and all help!! 

 RichieB

----------

## golding

 *RichieB wrote:*   

> //snip
> 
>  HTTP_PROXY="http://localhost:8001" 
> 
> //snip 
> ...

 

When I installed http-replicator I had to put "http_proxy" in lower case, i.e.

```
http_proxy="http://server.DNS/Name:8080"
```

has that changed?

----------

## RichieB

Thanks Robert, but I don't think that's the issue. http-replicator is successfully negotiating with the client because the error is coming from the proxy server itself.

RB

----------

## RichieB

*bump*   :Embarassed: 

----------

## think4urs11

 *RichieB wrote:*   

> Thanks Robert, but I don't think that's the issue. http-replicator is successfully negotiating with the client because the error is coming from the proxy server itself.

 Nevertheless, http_proxy is not the same thing as HTTP_PROXY. Have you already tried swapping your uppercase version for the lowercase version (on both client AND server)?

----------

## RichieB

Yes, I have - both lower and upper case! Is there any debug I could find which would help?

 :Surprised: 

----------

## RichieB

As an aside, why would I need http_proxy to be lowercase when the other variables in make.conf are upper case?

Additionally, here is the tail of http-replicator.log on the server:

10 Oct 2006 01:10:31 INFO: HttpReplicator started

10 Oct 2006 00:10:38 STAT: HttpClient 1 bound to 127.0.0.1

10 Oct 2006 00:10:38 INFO: HttpClient 1 direct request for /All

10 Oct 2006 00:10:38 DEBUG: HttpClient 1 cache position: /usr/portage/packages/All

10 Oct 2006 00:10:38 DEBUG: HttpServer 1 received header:

  GET /All HTTP/1.1

  host: localhost:8001

  accept-encoding: identity

10 Oct 2006 00:10:38 DEBUG: HttpClient 1 received header:

  HTTP/1.1 404 Not Found

  date: Tue, 10 Oct 2006 00:10:38 GMT

  connection: close

10 Oct 2006 00:10:38 DEBUG: HttpClient 1 closed

10 Oct 2006 00:10:38 STAT: HttpClient 1 received 56 bytes

There are, however, plenty of packages in /usr/portage/packages/All, so why I should get this message (reproduced below) is not clear to me. Please help!

Fetching binary packages info...

date: Tue, 10 Oct 2006 00:13:56 GMT

connection: close

<html><body><pre><h1>Not found</h1></pre></body></html>

address: /All

Traceback (most recent call last):

  File "/usr/bin/emerge", line 4049, in ?

    emerge_main()

  File "/usr/bin/emerge", line 4044, in emerge_main

    myopts, myaction, myfiles, spinner)

  File "/usr/bin/emerge", line 3452, in action_build

    myopts, myparams, spinner)

  File "/usr/bin/emerge", line 675, in __init__

    trees["/"]["bintree"].populate(

  File "/usr/lib/portage/pym/portage.py", line 5434, in populate

    self.remotepkgs = getbinpkg.dir_get_metadata(

  File "/usr/lib/portage/pym/getbinpkg.py", line 450, in dir_get_metadata

    filelist = dir_get_list(baseurl, conn)

  File "/usr/lib/portage/pym/getbinpkg.py", line 295, in dir_get_list

    raise Exception, "Unable to get listing: %s %s" % (rc,msg)

Exception: Unable to get listing: 404 Server did not respond successfully (404: Not Found)

----------

## golding

 *RichieB wrote:*   

> As an aside, why would I need http_proxy to be lowercase when the other variables in make.conf are upper case?

 I have no idea why, I just know I tried my system with the variable in upper case today and it stopped working.  I put it back to lower case and it resumed working.  I also tried changing the port (to 8001, with the relevant client changes to keep everything consistent) and that also stopped it working, except on the server, until I put everything back to vanilla.

My attitude tends to be:

If in doubt (I usually am), or if you don't know what you are doing (ditto), follow the manual to the last dotted 'i' and crossed 't'.

Change NOTHING! 

Once it is working, then you might be able to tweak things for other variables, but not before.

----------

## flybynite

 *RichieB wrote:*   

> 
> 
>  Please note that the http-replicator port has been set to 8001, as the client is running an HTTP proxy of its own for internet access. 
> 
>  On oddity you may notice is, in fact, that the server is running within a chroot environment on the client machine. 
> ...

 

The problem is most likely specific to your unusual configuration.  Http-replicator just works, so you're having a chroot problem, not a replicator problem - and it's most likely not worth trying to fix in this temporary chroot situation.

If you want to try and fix it anyway, there are two separate problems.  

1.  Is http-replicator working for normal package downloads?

NO, because of the "HTTP_PROXY" you're using.  Portage uses wget to download packages unless you've changed it.  wget only responds to "http_proxy" in lower case, as you've already been told.  Check the man page for wget or try:

```

$ HTTP_PROXY=http://bogus wget http://www.gentoo.org

--04:29:57--  http://www.gentoo.org/

           => `index.html'

Resolving www.gentoo.org... 38.99.64.201, 66.219.59.46, 66.241.137.77

Connecting to www.gentoo.org|38.99.64.201|:80... connected.

HTTP request sent, awaiting response... 200 OK

Length: 14,358 (14K) [text/html]

100%[======================================================================================================================================>] 14,358        71.76K/s

04:29:57 (71.74 KB/s) - `index.html' saved [14358/14358]

$ http_proxy=http://bogus wget http://www.gentoo.org

--04:30:13--  http://www.gentoo.org/

           => `index.html.1'

Resolving bogus... failed: Name or service not known.

```

You can see that the first command works with HTTP_PROXY=http://bogus because wget IGNORES it.  The second command fails, as it should, with the lowercase http_proxy.  This means http-replicator isn't being used by your client for portage downloads, which you can verify because the server's /var/cache/http-replicator dir will be empty or not updated with each client emerge.

If by some chance it is being used, the problem is probably a conflict with the proxy setup you're using to access the net, or some weird chroot interaction!!

2.  You're trying to get BINHOST working, which is the error you posted.

Replicator should just return a web page file listing.  Point your browser to:

```

http://localhost:8001/All

```

It should return a listing of your /usr/portage/packages/All directory.  If it doesn't, it's because that dir isn't available to replicator, probably due to the chroot setup or a permission problem.  

Even if the dir were empty, replicator would return an empty dir listing, not the "not found" error you're getting.  Check the dir permissions and also /var/log/http-replicator.log for any possible clues to the error.
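If you'd rather script the check than fire up a browser, here is a rough sketch (my own illustration, not part of replicator).  A stdlib http.server serving a temp dir stands in for http-replicator so the snippet runs anywhere; on the real box you would simply fetch http://localhost:8001/All and expect a 200 with a directory listing - a 404 points at a path, permission, or chroot visibility problem:

```
import http.server, os, tempfile, threading, urllib.request

# fake packages/All dir with one binary package in it
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "All"))
open(os.path.join(root, "All", "foo-1.0.tbz2"), "w").close()

def handler(*args, **kwargs):
    # serve the temp dir the way replicator serves its aliased dirs
    return http.server.SimpleHTTPRequestHandler(*args, directory=root, **kwargs)

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

resp = urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/All/")
listing = resp.read().decode()
print(resp.status)                 # 200 means the dir is visible
print("foo-1.0.tbz2" in listing)   # the package shows up in the listing
server.shutdown()
```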

Also check the simple stuff.

1. Since you're running the server in the chroot, are you sure you didn't start http-replicator before chroot'ing, or start it outside the chroot?

2. How is the internet proxy configured?  Any routing or ipchains rules in effect?

3. Are there any other portions of the howto you chose to ignore?

----------

## RichieB

flybynite et al: thanks [b]so[/b] much for your replies.

Of course, I changed HTTP_PROXY to http_proxy. And, of course, to no avail!   :Rolling Eyes: 

What I found [i]very[/i] interesting, although I cannot explain it, is if I change the client's make.conf from

PORTAGE_BINHOST="http://localhost:8001/All"

to

PORTAGE_BINHOST="http://localhost:8001"

With this alteration, I no longer get the "Not found" (404) message from http-replicator running on the chroot 'server' environment. At that point, emerge -uvaG is met with a "there are no packages available.." error, which I suppose is what you'd expect.

From another machine, I can browse to http://<chroot-http-replicator-ip-address>:8001 and see a list of .bz2's. But if I go to http://<chroot...>:8001/All then http-replicator responds with the "Not found" message above.

So, clearly, http-replicator [b]is[/b] working, but there is something awry with its access to the various directories ... changing DAEMON_OPTS=" ... --alias /usr/portage/packages/All:All" to "/usr/portage/packages:All" in the chroot server's http-replicator.conf seems to make no difference at all.

----------

## flybynite

 *RichieB wrote:*   

> 
> 
> Of course, I changed HTTP_PROXY to http_proxy. And, of course, to no avail!
> 
> 

 

Yes, this won't fix the BINHOST problem, but it did fix a problem you weren't aware of yet: it makes sure http-replicator is actually caching packages.  Did you check to see if any packages were in /var/cache/http-replicator?

Were there any clues in replicators log?

----------

## RichieB

Hi, fly, I appreciate your time!

Yes, there are over 600 packages in /var/cache/http-replicator.

I find it confusing that if I set PORTAGE_BINHOST to http://localhost:8001/ then the emerge process kind of works - but rather than emerging binary builds it fetches source and tries to compile it. Resetting it to the HOWTO-suggested http://localhost:8001/All/ produces the Not Found error. Interestingly, the http-replicator log shows:

11 Oct 2006 14:04:10 STAT: HttpClient 22 bound to 127.0.0.1

11 Oct 2006 14:04:10 INFO: HttpClient 22 direct request for /All/

11 Oct 2006 14:04:10 DEBUG: HttpClient 22 cache position: /usr/portage/packages/All/index.html

11 Oct 2006 14:04:10 DEBUG: HttpServer 22 received header:

  GET /All/ HTTP/1.1

  host: localhost:8001

  accept-encoding: identity

11 Oct 2006 14:04:10 DEBUG: HttpClient 22 received header:

  HTTP/1.1 404 Not Found

  date: Wed, 11 Oct 2006 14:04:10 GMT

  connection: close

11 Oct 2006 14:04:10 DEBUG: HttpClient 22 closed

11 Oct 2006 14:04:10 STAT: HttpClient 22 received 56 bytes

And, of course, there is no /usr/portage/packages/All/index.html on the http-replicator server. This is the problem - but what is causing it? I believe my configuration files (above) are correct for a binary repository ... I'm not sure, but I think this shows the client is OK but the error *is* with the server ...

----------

## flybynite

 *RichieB wrote:*   

> 
> 
> Yes, there are over 600 packages in /var/cache/http-replicator.
> 
> 

 

Is the chroot a complete filesystem?  I mean, is there a /var/cache/http-replicator in both the host and the chroot, or is there only one such dir on the box?

 *RichieB wrote:*   

> 
> 
> I find it confusing that if I set PORTAGE_BINHOST to http://localhost:8001/ then the emerge process kind of works - but rather than emerge binary builds it emerges source and tries to compile it. Resetting it to the HOWTO suggested http://localhost:8001/All/ produces the Not Found error. Interestingly, the http-replicator log shows:
> 
> 

 

The reason it works is that without the "All" you're browsing replicator's cache, so we know that dir is available to replicator.  With the "All" you're browsing the .../packages/All dir, which replicator can't enter/list for some reason.

So replicator probably can't enter or list the packages dir.  Check the permissions on the dir, and also enter the chroot with a shell and try to list the dir.

How do you have the outbound net proxy set up?  Is it ipchains or an ENV variable?

Show me the output of:

```

env

```

----------

## depontius

Is there a tool to clean up the http-replicator cache?

I've done a quick search through this thread (not all 20 pages), the Gentoo Forums, and Google to see if there's a tool to do this, but nothing has jumped out. A while back I found a thing called distclean.py that cleans stale files out of /usr/portage/distfiles. /var/cache/http-replicator tends to have the same problem of accumulating every version of every package that has ever been installed, anywhere on the cluster.

Obviously distclean.py has an easier time, in that it can check machine inventory and then delete stale files from that same machine's distfiles. An ideal tool would accept package lists from each client machine, as well as its own package list, and then clean packages out of the cache that are not associated with that combined list. 

Does such a tool exist?

Assuming someone (me?) has to try tweaking distclean.py, can someone point me to a tutorial for special-purpose ssh key management? It seems to me that each client could generate its own package list, and use scp to move it to a dedicated directory/file, and the server would take it from there. To run this under cron, it seems a good idea to use ssh with dedicated-purpose keys.

----------

## flybynite

 *depontius wrote:*   

> Is there a tool to clean up the http-replicator cache?
> 
> 

 

No dedicated tool, for the reasons you're thinking about.  You would have to know what every client needs.  Not only that, but you would have to choose from the many different ways of trimming the cache.  Would you expire based on time, version in use, version in portage, newest version, etc, etc.? 

This is the reason I haven't tried to make my own cleaner.  The basic cleaners such as distclean.py will work if you just point them at the replicator cache.  Others have used simple bash scripts to delete files based on age, or to keep the cache under a max size by deleting the oldest files.
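As a minimal sketch of the age-based approach (my own illustration, not a shipped tool - CACHE_DIR here is a throwaway temp dir so the example is safe to run; on a real server you would point it at /var/cache/http-replicator and pick your own MAX_AGE_DAYS):

```
import os, tempfile, time

CACHE_DIR = tempfile.mkdtemp()   # real server: /var/cache/http-replicator
MAX_AGE_DAYS = 90

# fabricate one stale and one fresh cached package for the demo
old = os.path.join(CACHE_DIR, "old-pkg-1.0.tar.gz")
fresh = os.path.join(CACHE_DIR, "fresh-pkg-2.0.tar.gz")
for path in (old, fresh):
    open(path, "w").close()
stale_time = time.time() - 100 * 86400
os.utime(old, (stale_time, stale_time))   # backdate mtime by 100 days

# the actual trim: delete regular files older than the cutoff
cutoff = time.time() - MAX_AGE_DAYS * 86400
for name in os.listdir(CACHE_DIR):
    path = os.path.join(CACHE_DIR, name)
    if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
        os.remove(path)

print(sorted(os.listdir(CACHE_DIR)))  # only the fresh package remains
```

The same logic is a one-liner with find on the shell side; the point is simply that replicator's cache is a flat directory of files, so any mtime-based policy works on it.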

----------

## RichieB

My apologies, flybynite, for not replying sooner. Have been a touch ill. Your continued help is very much appreciated.

 *flybynite wrote:*   

>  *RichieB wrote:*   
> 
> Yes, there are over 600 packages in /var/cache/http-replicator.
> 
>  
> ...

 

Yes, it's a complete filesystem, from the root down. It only shares the 'virtual' filesystems (/proc etc) with the non-chroot environment.

 *flybynite wrote:*   

> 
> 
>  *RichieB wrote:*   
> 
> I find it confusing that if I set PORTAGE_BINHOST to http://localhost:8001/ then the emerge process kind of works - but rather than emerge binary builds it emerges source and tries to compile it. Resetting it to the HOWTO suggested http://localhost:8001/All/ produces the Not Found error. Interestingly, the http-replicator log shows:
> ...

 

drwxrwxr-x 2 root portage 2344 Oct 11 01:53 All (/usr/portage/packages/All)

And the files within are chown root:portage and chmod 755

 *flybynite wrote:*   

> 
> 
> How do you have the outbound net proxy setup.  Is it ipchains or a ENV variable?
> 
> Show me the output of  :
> ...

 

server 'chroot' env:

```

MANPATH=/usr/local/share/man:/usr/share/man:/usr/share/binutils-data/i686-pc-linux-gnu/2.16.1/man:/usr/share/gcc-data/i686-pc-linux-gnu/3.4.6/man::/opt/blackdown-jdk-1.4.2.03/man:/usr/qt/3/doc/man

SHELL=/bin/bash

TERM=linux

HUSHLOGIN=FALSE

OLDPWD=/

QTDIR=/usr/qt/3

USER=root

GDK_USE_XFT=1

PAGER=/usr/bin/less

CONFIG_PROTECT_MASK=/etc/terminfo /etc/revdep-rebuild

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.4.6:/opt/blackdown-jdk-1.4.2.03/bin:/opt/blackdown-jdk-1.4.2.03/jre/bin:/usr/kde/3.5/sbin:/usr/kde/3.5/bin:/usr/qt/3/bin

MAIL=/var/mail/root

PWD=/etc

JAVA_HOME=/opt/blackdown-jdk-1.4.2.03

EDITOR=/bin/nano

JAVAC=/opt/blackdown-jdk-1.4.2.03/bin/javac

QMAKESPEC=linux-g++

KDEDIRS=/usr

JDK_HOME=/opt/blackdown-jdk-1.4.2.03

SHLVL=3

HOME=/root

G_FILENAME_ENCODING=UTF-8

LESS=-R -M --shift 5

PYTHONPATH=/usr/lib/portage/pym

LOGNAME=root

CVS_RSH=ssh

GCC_SPECS=

CLASSPATH=.

LESSOPEN=|lesspipe.sh %s

INFOPATH=/usr/share/info:/usr/share/binutils-data/i686-pc-linux-gnu/2.16.1/info:/usr/share/gcc-data/i686-pc-linux-gnu/3.4.6/info

OPENGL_PROFILE=nvidia

G_BROKEN_FILENAMES=1

CONFIG_PROTECT=/usr/share/X11/xkb /usr/kde/3.5/share/config /usr/kde/3.5/env /usr/kde/3.5/shutdown /usr/share/config

_=/usr/bin/env

```

... and the 'client' non-chroot env:

```

MANPATH=/usr/local/share/man:/usr/share/man:/usr/share/binutils-data/i686-pc-linux-gnu/2.16.1/man:/usr/share/gcc-data/i686-pc-linux-gnu/3.4.6/man::/opt/blackdown-jdk-1.4.2.03/man:/usr/qt/3/doc/man

SHELL=/bin/bash

TERM=linux

HUSHLOGIN=FALSE

QTDIR=/usr/qt/3

OLDPWD=/root

USER=root

GDK_USE_XFT=1

PAGER=/usr/bin/less

CONFIG_PROTECT_MASK=/etc/terminfo /etc/revdep-rebuild

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.4.6:/opt/blackdown-jdk-1.4.2.03/bin:/opt/blackdown-jdk-1.4.2.03/jre/bin:/usr/kde/3.5/sbin:/usr/kde/3.5/bin:/usr/qt/3/bin

MAIL=/var/mail/root

PWD=/etc

JAVA_HOME=/opt/blackdown-jdk-1.4.2.03

EDITOR=/bin/nano

JAVAC=/opt/blackdown-jdk-1.4.2.03/bin/javac

QMAKESPEC=linux-g++

KDEDIRS=/usr

JDK_HOME=/opt/blackdown-jdk-1.4.2.03

SHLVL=1

HOME=/root

G_FILENAME_ENCODING=UTF-8

LESS=-R -M --shift 5

PYTHONPATH=/usr/lib/portage/pym

LOGNAME=root

CVS_RSH=ssh

GCC_SPECS=

CLASSPATH=.

LESSOPEN=|lesspipe.sh %s

INFOPATH=/usr/share/info:/usr/share/binutils-data/i686-pc-linux-gnu/2.16.1/info:/usr/share/gcc-data/i686-pc-linux-gnu/3.4.6/info

OPENGL_PROFILE=nvidia

G_BROKEN_FILENAMES=1

CONFIG_PROTECT=/usr/share/X11/xkb /usr/kde/3.5/share/config /usr/kde/3.5/env /usr/kde/3.5/shutdown /usr/share/config

_=/usr/bin/env

```

----------

## RichieB

Hi! So sorry, but ...bump ?  :Smile: 

----------

## Underdone

Under the client /etc/make.conf you have this *Quote:*   

>  SYNC="rsync://localhost/gentoo"
> 
> HTTP_PROXY="http://localhost:8001"
> 
> PORTAGE_BINHOST="http://localhost:8001/All"

  Shouldn't it be the server's IP address and not localhost, since this is the client?

----------

## bunkacid

 *depontius wrote:*   

> Is there a tool to clean up the http-replicator cache?
> 
> ...
> 
> Obviously distclean.py has an easier time
> ...

 

I've used distclean.py version 0.2 to clean out the downloaded sources before the eclean tool came along.

```

# as root.

DISTDIR=/usr/portage/.http-replicator/ ./distclean.py

```

This works for me; obviously you will have to change the paths to where you keep your files.

You should be able to use the eclean utility, which comes with gentoolkit-0.2.2.

Unfortunately, because eclean will not use command-line ENVIRONMENT VARIABLES as emerge does, you will have to change your make.conf temporarily to point to the location(s) you wish to clean.  Or even hardcode the changes into the eclean utility, which I wouldn't recommend.

Perhaps we should file a bug asking for eclean to be able to read the ENVIRONMENT VARIABLES, which would solve a lot.

 *depontius wrote:*   

> ...
> 
> Assuming someone (me?) has to try tweaking distclean.py, can someone point me to a tutorial for special-purpose ssh key management? It seems to me that each client could generate its own package list, and use scp to move it to a dedicated directory/file, and the server would take it from there. To run this under cron, it seems a good idea to use ssh with dedicated-purpose keys.

 

I'm not sure about automatically updating a master server with all the packages that are being used throughout a network, but I would suggest you start by trying out ~/.ssh/authorized_keys for some automation tasks.

This is a good walkthrough on how to set up the keys:

http://geekpit.blogspot.com/2006/04/five-minutes-to-more-secure-ssh.html
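As a rough sketch of the dedicated-purpose idea (the key material, path, and client name below are placeholders, not from any real setup), a restricted entry in the server's ~/.ssh/authorized_keys could look like:

```

# One line in ~/.ssh/authorized_keys on the server (placeholders only).
# The forced command plus no-pty/no-port-forwarding restricts this cron
# key to delivering the package list and nothing else:
command="scp -t /var/lib/pkglists/",no-pty,no-port-forwarding ssh-rsa AAAA...key-material... client1-pkglist

```

That way, even if a client's passwordless cron key leaks, it can't be used to open an interactive shell on the server.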

----------

## hielvc

RichieB and any others who are using a chroot: you have to point to your chroot's packages directory, not the server's packages directory, unless that's where you are copying your packages to. My chroot is /home/gentoo and my packages dir for it is /home/gentoo/usr/packages (don't ask me why I put it there), so in my /etc/conf.d/http-replicator I have DAEMON_OPTS="$DAEMON_OPTS --alias /home/gentoo/usr/packages/All:All" and it works fine. That is, when I can get portage to use it   :Evil or Very Mad: 
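For reference, the relevant line spelled out as it appears in my config file (the /home/gentoo paths are just where my chroot happens to live - adjust to yours):

```

# /etc/conf.d/http-replicator on the host.  The alias maps the URL path
# "All" to the packages dir as the replicator process actually sees it,
# which for a chroot means the full host-side path into the chroot:
DAEMON_OPTS="$DAEMON_OPTS --alias /home/gentoo/usr/packages/All:All"

```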

----------

## naim

I keep getting this message. I checked, and the partition I assigned to /var/cache/http-replicator is 50% full, with 5 GB free.

/dev/sda7              11G  5.1G  5.2G  50% /var/cache/http-replicator

I switched to verbose logging and I found the following in the logs:

 *Quote:*   

> Traceback (most recent call last):
> 
>   File "/usr/lib/python2.4/asyncore.py", line 69, in read
> 
>     obj.handle_read_event()
> ...

 

and on the client side: 

 *Quote:*   

> Calculating dependencies... done!
> 
> >>> Emerging (1 of 4) dev-perl/Archive-Zip-1.16 to /
> 
> >>> Downloading 'http://192.168.20.251:8083/distfiles/Archive-Zip-1.16.tar.gz'
> ...

 

My /etc/conf.d/httpreplicator.conf

 *Quote:*   

> 
> 
> GENERAL_OPTS="--dir /var/cache/http-replicator"
> 
> GENERAL_OPTS="$GENERAL_OPTS --user portage"
> ...

 

Any ideas ??

----------

## bunkacid

 *naim wrote:*   

> i keep getting this message, i checked, the partition i assigned to /var/cache/http-replicator is 50% full, with 5 Gb free.
> 
> /dev/sda7              11G  5.1G  5.2G  50% /var/cache/http-replicator
> 
> i switched to verbose logging and i found the following in the logs::
> ...

 

You've run out of space somewhere that http-replicator is outputting data/logs/information/etc to.

What are the results of;

```
df -m
```

----------

## naim

oops, root partition full, i was looking the wrong way, thanks

----------

## dalek

I have been using this for a while but all of a sudden I get this:

```
root@smoker / # repcacheman

Replicator's cache directory: /var/cache/http-replicator/

Portage's DISTDIR: /usr/portage/distfiles/

Comparing directories....

Done!

Deleting duplicate file(s) in /usr/portage/distfiles/

http-replicator_3.0.tar.gz

Done!

New files in DISTDIR:

seamonkey-1.1-patches-0.1.tar.bz2

mozilla-launcher-1.56.bz2

seamonkey-1.1.1.source.tar.bz2

Checking authenticity and integrity of new files...

Searching for ebuilds...

Done!

Found 22919 ebuilds.

Extracting the checksums....

Done!

Verifying checksum's....

/usr/portage/distfiles/seamonkey-1.1-patches-0.1.tar.bz2

Traceback (most recent call last):

  File "/usr/bin/repcacheman.py", line 203, in ?

    if t["MD5"]:

KeyError: 'MD5'

root@smoker / #

```

I tried reemerging it but I still get the same error.  What is this?  Is this me or something else?

Thanks

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## flybynite

 *dalek wrote:*   

> 
> 
> /usr/portage/distfiles/seamonkey-1.1-patches-0.1.tar.bz2
> 
> KeyError: 'MD5'
> ...

 

This type of error is probably caused by an error in the portage data.  Wait a while, re-sync and see if it goes away.

----------

## dalek

Hmmm, I should have mentioned this has been going on for a week or more.  I have synced three or four times since this started.

I do seem to recall that portage and gentoolkit were upgraded recently.  Could that have something to do with this?  I may go back a version of gentoolkit and try it.  I think that was the last one upgraded, and it was about the time this started.

Open to any ideas though.  I really need to fix this so I can update my other rigs.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## dalek

Well, I went back to an older version of portage and gentoolkit and still no joy.  I also synced again.  

Any more ideas?  I need some.     :Wink: 

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## dalek

OK.  This fixed it:

```
mv -v /usr/portage/distfiles/seamonkey-1.1-patches-0.1.tar.bz2 /root/
```

That will work until I need that patch again.  Does nobody have a clue what is up with that?  Is the patch corrupt or something?

Thanks

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## bunkacid

I believe some of the portage Devs are making changes to the Manifest files which could explain the MD5 error.

----------

## dalek

Thanks for the info.  I'm surprised no one else has posted they are getting the same error though.  Maybe it will sort itself out pretty soon.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## flybynite

 *dalek wrote:*   

> I'm surprised no one else has posted they are getting the same error though. 

 

repcacheman uses portage functions, which means this error could be specific to your portage system or setup.

1.  Does this file belong to a local portage package?  A package that you made yourself, copied from somewhere, or maybe even got from a portage overlay?

2.  Since you upgraded portage lately, according to http://www.gentoo.org/proj/en/portage/doc/common-problems.xml it could be caused by an old version of portage or maybe an incomplete upgrade of portage:

 *Quote:*   

> 
> 
> Receiving
> 
>     portage.db["/"]["porttree"].dbapi.auxdb[porttree_root][cat].clear()
> ...

 

I know you don't have this exact error, but I believe the KeyError is the important part, and it is the error you reported  :Smile: 

----------

## dalek

I went back to an older version of portage, just to see if it was a change in portage that had just kicked in.  Same thing, even after doing a sync again.  I have since gone back to the new version of portage.  I also did the same thing with gentoolkit, just in case.

I read the link you posted.  Since the fix was to re-emerge portage, I would hope that downgrading and upgrading would have taken care of that.  The key word is hope, though.    :Wink: 

The file did come from emerging Seamonkey.  It was a recent upgrade.  I don't even know how to do an overlay.  After moving the file as mentioned above, I have not had the error since, even after syncing.

Still open to ideas though.  I can move the file back if needed.  

Thanks

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## bunkacid

Have you tried removing the offending file?

----------

## dalek

Yup, that was how I got rid of the error.  I have run into corrupt files before, though, and it would just delete them, or not copy them over, and go on.  On this one, it made the CPU go to about 100%, sit there for a really loooong time and puke out that error.  It has never taken that long to run repcacheman before.  It just seemed that this file triggered something nasty.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

edit: forgot the error part.    :Embarassed: 

----------

## flybynite

Hmmm, I tried to duplicate the error and I can't even get the same file.

Did the devs update the ebuild last night, or are you using an old version of the patches because you haven't emerge sync'd?

I only have patches 1.1.1 ??

 *Quote:*   

> 
> 
>  repcacheman
> 
> Replicator's cache directory: /var/cache/http-replicator/
> ...

 

----------

## dalek

You may be on to something.  I did an emerge -f seamonkey and it redownloaded the patch, and it worked fine when I ran repcacheman.  I would have to guess that the old one was corrupt, but then how come portage used it to install Seamonkey without noticing, I wonder?

That is weird.  I guess that is why it happened to me.    :Laughing:   :Laughing: 

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## enkryptz

Can anyone hook me up with the latest sources for http replicator?

The webpage seems to have been inactive for a long time; there is some news about a 3.1 development version,

but I can only find 3.0, which is quite old.

Any public CVS or SVN repository available where I can follow the development?

----------

## dalek

Try here:

http://cudlug.cudenver.edu/gentoo/distfiles/

and just look for the http-replicator tarball.  It's a long list, but it's in there.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## dahoste

Regarding dalek's 'MD5' error while running repcacheman -- you're not the only one!  I *have* been having that exact problem, but it isn't with the seamonkey file - it's with all kinds of different files.  At first I just did what I usually do with an offending file in distfiles - delete it - but the problem seems pretty rampant right now (at least with the set of stuff I've gotten lately in distfiles), so I'm treating it as a more serious problem.

I haven't done any rebuilds of portage or gentoolkit.  That is, as an attempt to fix the problem, I did get the latest portage and gentoolkit updates, but I think the MD5 problem was happening before that anyway.  I've also done multiple syncs and 'emerge -uDav world' over the last week.

Here's hoping it goes away in the near future.  I'm not sure what to try that might help.

----------

## hielvc

Same here. I have had 2 offending packages. The second one was 

 *Quote:*   

> Verifying checksum's....
> 
> /usr/portage/distfiles/eix-0.9.1.tar.bz2
> 
> Traceback (most recent call last):
> ...

 

----------

## dalek

This is sort of funny.  I keyworded and unmasked that eix package and did a fetch for it.  I then ran repcacheman.  I got a error but not for the eix package.  I got this:

```
Deleting duplicate file(s) in /usr/portage/distfiles/

eix-0.9.1.tar.bz2

java-config-wrapper-0.12.tar.bz2

javatoolkit-0.2.0.tar.bz2

java-config-1.3.7.tar.bz2

java-config-2.0.31.tar.bz2

Done!

New files in DISTDIR:

j2sdk-1_4_2-doc.zip

jdk-1_5_0-doc-r1.zip

Checking authenticity and integrity of new files...

Searching for ebuilds...

Done!

Found 22902 ebuilds.

Extracting the checksums....

Done!

Verifying checksum's....

/usr/portage/distfiles/j2sdk-1_4_2-doc.zip

Traceback (most recent call last):

  File "/usr/bin/repcacheman.py", line 203, in ?

    if t["MD5"]:

KeyError: 'MD5'

root@smoker / #           
```

So mine was fine with eix where yours wasn't.  I synced last night too.  Do you get the same error for j2sdk?

I'm hoping this will help them figure out if it is a local thing specific to us or if it is something else.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## dahoste

Another bit of info: I just ran repcacheman on one of my other servers (which I hadn't done in a while - maybe a week or two) and it plowed through everything in distfiles just fine.  The interesting piece is that this server is actually the one that feeds packages to all of my LAN machines -- so it had the same set of files in distfiles that caused the 'MD5' error when running repcacheman on the other machine.  I wonder if that indicates that the problem might originate when http-replicator serves package files.  The two machines on which I have http-replicator running are as near identical as I can get (that's actually the purpose - one is a backup/test for the other), with the exception that the one with no 'MD5' errors pulls packages down from the internet, and the one with 'MD5' errors pulls packages down from the first machine.

Too busy right now to run targeted experiments to see if I can narrow down the circumstances surrounding the problem, but if I get a chance, I'll report back here.

----------

## flybynite

 *enkryptz wrote:*   

> Can anyone hook me up with the latest sources for http replicator?
> 
> 

 

The current ebuild is the latest sources.

http://gertjan.freezope.org/replicator/  is the last known web page for Gertjan.  He authored http-replicator specifically for Debian and is not a Gentoo user.

I worked with him to improve the code and make it Gentoo-friendly.  I am the author of repcacheman and other Gentoo-specific code such as the ebuild and configs.

Gertjan, despite my best effort, wanted to completely rewrite the code from scratch rather than refactor and add much needed features.  As far as I know, he never completed any updates.

Are you interested in hacking the code?  Do you need some specific update?

----------

## flybynite

dalek,dahoste,hielvc,

Congrats, you've found a BUG!

It seems there is a change in portage working its way into the portage tree somehow related to GLEP 44.  It changes the way portage stores the checksums of files.

This wouldn't cause any problems by itself since repcacheman only calls the portage database functions, but it seems MD5's are either no longer going to be used or portage dev's are not bothering to add them to the new manifest format anymore.

I'll see what I can find out about the change in portage.

Thanks again!

----------

## hielvc

This thread talks about a VERY undesirable change in portage behavior  :Sad: 

----------

## b1f30

 *hielvc wrote:*   

> This thread talks about a VERY undesirable change in portage behavior 

 

Wow - that is quite a change that I was not aware of. Does this obsolete http-replicator?

Yeesh...

----------

## dahoste

 *Quote:*   

> Wow - that is quite a change that I was not aware of. Does this obsolete http-replicator? 

 

um... I don't see how it would.  http-replicator is a means of proxy-caching gentoo packages.  It doesn't alter, nor particularly care what's in, any of the packages it caches/proxies.  The need that http-replicator satisfies isn't going away.  The 'undesirable change' mentioned in the previous post (and related thread) sounds like a hiccup in the transitional phase of the GLEP44 proposal.  The GLEP44 itself says this:

 *Quote:*   

> It is important to note that this proposal only deals with a change of the format of the digest and Manifest system.

 

So, no, once the dust settles, we'll all appreciate having http-replicator as much as we currently do.   Unless I'm really missing something about all of this.

----------

## b1f30

 *dahoste wrote:*   

>  *Quote:*   Wow - that is quite a change that I was not aware of. Does this obsolete http-replicator?  
> 
> um... I don't see how it would.  http-replicator is a means of proxy-caching gentoo packages.  It doesn't alter, nor particularly care what's in, any of the packages it caches/proxies.  The need that http-replicator satisfies isn't going away.  The 'undesirable change' mentioned in the previous post (and related thread) sounds like a hiccup in the transitional phase of the GLEP44 proposal.  The GLEP44 itself says this:
> 
>  *Quote:*   It is important to note that this proposal only deals with a change of the format of the digest and Manifest system. 
> ...

 

The problem isn't so much http-replicator, but perhaps its indexing tool, repcacheman, which seems to fail a lot due to not being able to find MD5s for a lot of packages now - at least that's what I've observed from my own experience within the past month, and from what I've read in previous posts in this thread.

----------

## flybynite

repcacheman uses portage functions to do its job.  It just needs to call the new portage functions and all is well.  I'm working on the new changes now.  The only real change is it needs to call the sha1 checksum rather than the md5 checksum.

Also, repcacheman isn't really required at all.  Truth is I don't even run it all the time.

It does several things for you that aren't strictly needed to make http-replicator work.

1.  It creates the cache dir upon install.  It does this only upon initial install so the ebuild or users don't have to.  If you had to do it manually: mkdir /var/cache/http-replicator, then chown portage:portage /var/cache/http-replicator.

2.  It transfers your initial distdir to the cache dir.  Allows your already populated distdir to populate the cache dir.  Nice, but not required.

3.  It deletes duplicate files only on the http-replicator server.  I just delete all the distdir files on the server.  The most I lose is one or two files that were only available through FTP.  With a good mirror, I can go months without using FTP at all so I don't lose anything.  Check your logs: how many times has repcacheman actually added new files?

So repcacheman will soon work with the small but increasing number of manifest2 packages.  Until then, http-replicator works just fine.
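If you do skip repcacheman, the one-time setup from item 1 can be done by hand.  A minimal sketch, using the default path from this howto and wrapped in a function so the location can be overridden:

```shell
# What repcacheman's initial setup does, as a manual fallback.
setup_cache() {
    local dir="${1:-/var/cache/http-replicator}"   # howto's default cache dir
    mkdir -p "$dir"
    # Hand the cache to the portage user so the daemon can write to it;
    # needs root, and is skipped when no portage user exists on the box.
    if id portage >/dev/null 2>&1; then
        chown portage:portage "$dir" || echo "chown failed (run as root?)" >&2
    fi
}
# Usage: setup_cache                # default location
#        setup_cache /srv/repcache  # custom location (hypothetical path)
```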

----------

## lorenct

For a while I have had a problem with http-replicator not remaining running after a reboot on my central LAN server.

I would have to SSH into the system and restart it manually for it to remain running.

Then I noticed once, when SSH'ing into the system before it was completely up, that the http-replicator process (python script) was running on tty1.  A short time later the process was gone and the /sbin/agetty process from /etc/inittab was running on tty1 (per a ps -ef command).  I know that the /etc/init.d/http-replicator script is supposed to run the python script as a daemon, but it is possible that the script is not forking as a daemon fast enough before being "smacked-down" by the /sbin/agetty process and terminated.  (I do not know for sure, but that is what it seems like is happening to me.)

So I changed the following in /etc/init.d/http-replicator which seemed to resolve the problem for me:

FROM:

```

start() {

        ebegin "Starting Http-Replicator"

        start-stop-daemon --start --pidfile /var/run/http-replicator.pid --name http-replicator \

                --startas /usr/bin/http-replicator -- -s -f --pid /var/run/http-replicator.pid --daemon $DAEMON_OPTS

        eend $? "Failed to start Http-Replicator"

}

```

TO:

```

start() {

        ebegin "Starting Http-Replicator"

        start-stop-daemon --start --background --pidfile /var/run/http-replicator.pid --name http-replicator \

                --startas /usr/bin/http-replicator -- -s -f --pid /var/run/http-replicator.pid --daemon $DAEMON_OPTS

        eend $? "Failed to start Http-Replicator"

}

```

NOTICE: The addition of the --background option to the start case.

Just thought I would share this with the rest of you in case you have encountered the same problem...

----------

## dalek

I would also like to add something to the mix, since you are already under the hood fixing things.  I'm on a really slow dial-up and sometimes I have to stop replicator for various reasons.  I notice that when I restart it and restart the emerge process which will try to continue the download, it has trouble reconnecting to replicator.  Replicator will start to download right away as it should but emerge can't seem to get back in sync.

I run into things like this on the really large packages.  If you need me to I can post what I get so you can see it.  

Maybe one of these days I will get DSL or something.  Solve a lot of problems.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## flybynite

 *dalek wrote:*   

>  I have to stop replicator for various reasons.

 

How are you stopping the download /replicator?  Are you using the init scripts, killing a process or just shutting down the system?

What mirror are you using?

----------

## dalek

I do use the init scripts to stop it.  /etc/init.d/http-replicator stop.  The mirror varies, just whichever one it picks I guess.  It seems to do the same on about all of them though.

I do have a big download tonight if you want me to post what it does, or email it to you over pm.  If I hadn't logged out of KDE I could post you one now.

```
root@smoker / # uptime

 22:01:56 up 26 days, 19:45,  1 user,  load average: 1.10, 1.05, 1.08

root@smoker / #

```

I try not to reboot too much, mostly when the power goes out.

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## Scen

 *lorenct wrote:*   

> Just thought I would share this with the rest of you in case you have encountered the same problem...

 

You rule, boy  :Cool: 

I've searched this solution without results for several months!

I'll open ASAP a bug report to fix this issue!

Thanks a lot again!  :Wink: 

----------

## Griffon26

 *Scen wrote:*   

> 
> 
> I've searched this solution without results for several months!
> 
> 

 

Then why didn't you report it before today? I'm the maintainer of http-replicator, but I don't read the forums. When you find a problem, please submit a bug so I know about it.

----------

## Scen

You're right....  :Embarassed: 

/me flagellates himself   :Razz: 

I've read your e-mail, I'll send you more information ASAP  :Cool: 

----------

## flybynite

 *dalek wrote:*   

> I do use the init scripts to stop it.  /etc/init.d/http-replicator stop.  The mirror varies

 

OK, that eliminates some possible problems.  Then I'll have to say I'm afraid you've found a  "feature", not a bug.  Well, actually a lack of a feature  :Sad: 

Scen and Griffon26, I'll save you some time chasing this down!

The feature is that replicator supports resuming on the client end but not on the internet end.  If replicator is stopped or killed it will delete the partial download from the cache!!!!  Think about that for a second and your problem will be clear.

Here is how to test it.  Start a long fetch (on dialup any file will probably do  :Smile:  , openoffice-bin for the rest of us) and then shut down http-replicator before the download is finished.  Then look and see if the partial file is in the cache.  It won't be there, but portage will keep the partial download in the distfiles dir!!

Then you restart replicator and restart your long fetch.  Replicator will try to honor portage's request to resume the download, but it has to start the download from the beginning over the net!!  This means portage will probably time out, or you will give up, before the download from the net catches up with the partial download portage still has.

So, at its current level of development, http-replicator is not designed for, nor can it handle, all situations; shutting it down in the middle of a download is not a supported use at this time (developers needed).  Resuming on the net side, along with FTP support, was a planned part of the next release of http-replicator.  Unfortunately, that release was bogged down in a total rewrite.

dalek, to suit your dialup needs, it would be better to bypass replicator for the interrupted long download, then move the file into the cache when complete.  Just do this:

```

http_proxy="" emerge -f longdownload

```

restarting as often as necessary, and when it completes:

```

repcacheman

```
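The restart-as-needed step can be automated with a small retry helper.  This is only a sketch, not part of http-replicator: the `retry` name, the `RETRY_DELAY` knob, and the package in the usage line are all made up for illustration.

```shell
# Re-run a command until it succeeds, pausing between attempts.
# Note: loops forever if the command can never succeed.
retry() {
    until "$@"; do
        echo "interrupted, retrying..." >&2
        sleep "${RETRY_DELAY:-30}"
    done
}
# Usage, bypassing the proxy as above and importing the file when done:
#   retry env http_proxy="" emerge -f openoffice-bin && repcacheman
```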

----------

## dalek

OK.  The solution you posted is what I have been doing, sort of.  My biggest problem is my ISP puts a 6 hour limit on unlimited access.  Go figure that one out.    :Rolling Eyes:    I have been editing make.conf but it does the same thing I guess.

Think maybe one day this will be fixed  :Question:    I did see some phone trucks the other day.  We may be getting DSL soon.    :Very Happy:    Pardon me while I go jump around like an idiot at even the thought of broadband.    :Laughing: 

 :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy:   :Very Happy: 

----------

## flybynite

I've waded through the depths of portage, and returned with a new version of repcacheman.

I'm continuing to develop and add new features, but now is a good time to test and get some feedback from users.  Since I have new features still in development, I chose an older revision to test, but it should work much better than the previous version and have all the old features.

My tests show this beta 4.0 uses only 10% of the resident memory of the previous version and runs 4 times faster!!

It can be run from any dir as root or dropped in place of the old /usr/bin/repcacheman.py ( not /usr/bin/repcacheman )

The code can be downloded from this temp location:

http://home.earthlink.net/~poplawtm/rep4.py.tar.gz

Last edited by flybynite on Tue May 15, 2007 12:39 am; edited 2 times in total

----------

## mkzelda

On the first run, it fails and the directory it creates is not where I specified in /etc/conf.d/http-replicator.

```

Begin Http-Replicator Setup....

        created /var/cache/http-replicator/

Traceback (most recent call last):

  File "/usr/bin/repcacheman", line 73, in ?

    print "\tchange owner " + dir + " to " + user + " failed:"

NameError: name 'user' is not defined

```

When I run repcacheman again, it works properly other than ignoring my desired directory in the conf.

```

!!! Digest verification failed:

!!! /usr/portage/distfiles/gem_plugin-0.2.2.gem

!!! Reason: Failed on RMD160 verification

!!! Got: 9715b571202ebe33d72bfd6384305b555da6f2b6

!!! Expected: 4759f2ccb75081ebe46ffffc3ad5c7ba2e20c3bc

!!! Digest verification failed:

!!! /usr/portage/distfiles/file-4.20.tar.gz

!!! Reason: Filesize does not match recorded size

!!! Got: 548412

!!! Expected: 548393

!!! Digest verification failed:

!!! /usr/portage/distfiles/mongrel-1.0.1.gem

!!! Reason: Filesize does not match recorded size

!!! Got: 159232

!!! Expected: 160256

SUMMARY:

Found 0 duplicate file(s).

        Deleted 0 dupe(s).

Found 1103 new file(s).

        Added 1040 of those file(s) to the cache.

Rejected 60 File(s) not in Portage.

```

Oops, I did that on the wrong machine. 

The second machine's results are the same: the first attempt fails, rep4.py ignores my desired cache dir, and it picked up a few more bad files. Now, I'd be happy if it'd just use my ftp pub dir instead of /var/cache/http-replicator.

--update

I also have to set the conf back to /var/cache/http-replicator for the time being or http-replicator fails.

----------

## flybynite

 *mkzelda wrote:*   

> On the first run, it fails and the directory it creates is not where I specified in /etc/conf.d/http-replicator.
> 
> 

 

Thanks for the bug report mkzelda,  I've fixed the 'user' is not defined problem.  

Now did you run repcacheman from a dir or did you copy over your old /usr/bin/repcacheman?

----------

## mkzelda

i replaced /usr/bin/repcacheman

----------

## flybynite

 *mkzelda wrote:*   

> i replaced /usr/bin/repcacheman

 

Ooops, I'm sorry.  I've corrected my post above but I meant to say replace /usr/bin/repcacheman.py

/usr/bin/repcacheman just calls /usr/bin/repcacheman.py with the correct options, which is why you had the other problems.

Either re-emerge http-replicator which won't disturb your config or edit /usr/bin/repcacheman to look like this

```

#! /bin/bash

source /etc/conf.d/http-replicator

/usr/bin/repcacheman.py $GENERAL_OPTS

```

and replace /usr/bin/repcacheman.py with the beta script.

Again, sorry for the inconvenience.

I've uploaded beta 4.1 with two typos fixed.  There is still something going on with the core code.  Right now I think it is a filename collision in portage itself.  I didn't change the download link, but you will see rep41.py inside.

----------

## mkzelda

Okay, that worked, with verbose output of the portage tree. Is there a trigger to avoid the verbosity? My server performs slower when outputting scrolling text.

I'm wondering how files in the cache are treated. Are they assumed to be good, and thus unchecked? For example, if I have an overlay on another machine that my server does not, the files it grabs are stored in the cache as they are fetched, and they remain there indefinitely? So, can I put any files in the cache that my client machines might fetch, such as livecd .iso's, and so long as the client used wget with http_proxy specified it can fetch that locally?

----------

## flybynite

 *mkzelda wrote:*   

> Okay, that worked, with verbose output of the portage tree. Is there a trigger to avoid the verbosity? My server performs slower when outputting scrolling text.
> 
> 

 

The verbose output is just my debugging going on, it won't be in the final version.  I've uploaded beta revision 4.3 that removes the scrolling and fixes the problem in the core code I mentioned earlier.

 *mkzelda wrote:*   

> 
> 
> I'm wondering how files in the cache are treated. Are they assumed to be good, and thus unchecked? For example, if I have an overlay on another machine that my server does not, the files it grabs are stored in the cache as they are fetched, and they remain there indefinitely? So, can I put any files in the cache that my client machines might fetch, such as livecd .iso's, and so long as the client used wget with http_proxy specified it can fetch that locally?

 

Thanks for asking!  I've been trying to decide some possible options to add and who might need them.  I also want the greatest possible options for users.

replicator is a general purpose proxy at heart.  It will serve and cache anything that goes through it, even web browsing.  There is an "alias" option to serve files from a dir of your choice in addition to the cache.  It defaults to serving BINARY packages from gentoo's default location, but you can add to or replace that default.

/etc/conf.d/http-replicator

```

## Local dir to serve clients.  Great for serving binary packages

## See PKGDIR and PORTAGE_BINHOST settings in 'man make.conf'

## --alias /path/to/serve:location will make /path/to/serve

## browsable at http://http-replicator.com:port/location

DAEMON_OPTS="$DAEMON_OPTS --alias /var/tmp/packages/All:All"

```

So if you want to serve random files, you can keep them in a separate dir for easy management by fetching them with the alias url, or keep them in the cache and fetch them with the http_proxy setting.  Multiple alias options are allowed.  Http-replicator was designed to be a secure, high performance web server with a cache.
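Multiple aliases just stack in the conf file.  A sketch, extending the block above; the second path and label are made-up examples for serving livecd images to LAN clients:

```shell
## /etc/conf.d/http-replicator -- stacking alias options; the isos line
## is a hypothetical example, not a default.
DAEMON_OPTS="$DAEMON_OPTS --alias /var/tmp/packages/All:All"
DAEMON_OPTS="$DAEMON_OPTS --alias /srv/isos:isos"
```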

replicator doesn't check its own cache for this reason.  It won't touch anything in its cache because it may contain user files.

The question is: should replicator check its cache?

I say no right now, because it can be done better by other means.  But would adding that feature be convenient for many users?

1.  If replicator is a gentoo only cache, there are other distfile checking scripts that will delete files  based on many tests such as not in portage, not the most current version, older than a certain date, exceed a maximum cache size, not accessed in the last 3 months, etc etc.

2.  If replicator is used for other files I can't even guess how to prune the cache.

What I do is this.  It could be a cron script but I do it manually by choice.

```

mv /var/cache/http-replicator/* /var/tmp/distfiles/

repcacheman

rm -rf /var/tmp/distfiles/*

```

This moves the cache files to the distfiles dir.  It is fast because it only renames the files; nothing moves on disk (assuming both dirs are on the same filesystem).

repcacheman then runs, which moves all good files back to the cache.

Then I delete all the remaining files, which are not in portage or are corrupt/incomplete.

You could also move the files, run the distfile cleaning script to prune based on your desires, then run repcacheman!

There was a time when distfile cleaning scripts were hard to find, now eclean is part of gentoolkit.

I know that was probably more than you wanted to know but I hope it helped you and some lurkers  :Smile: 

----------

## dahoste

flybynite: does your new beta version address the MD5 problem when computing checksums?

----------

## flybynite

 *dahoste wrote:*   

> flybynite: does your new beta version address the MD5 problem when computing checksums?

 

Yes, the new version is fully portage manifest2 compliant and is much faster than the previous version.

----------

## golding

flybynite

Some time ago (early '06 I think) I posted here that http-replicator would be started in the rc init scripts, but when I went to emerge anything I had to restart it.  This behaviour has remained until yesterday.

Before then I was using a login manager of varying types from gdm to xdm and even the Enlightenment greeter, but yesterday I decided I had had enough and wanted to properly secure my lan by using proper console login procedures.

Surprise, surprise!  Suddenly http-replicator did not have to be re-started after login, now it works without that annoying restart before I emerge anything.

I do not know if this is a bug, however, I thought you might like to know.

----------

## flybynite

 *golding wrote:*   

> flybynite
> 
> I do not know if this is a bug, however, I thought you might like to know.

 

Yes, I remember  :Smile: 

A bug was filed similar to yours (I don't think you filed it) , but I could never reproduce it.  Please check if you can help maurice here

https://bugs.gentoo.org/show_bug.cgi?id=177428

----------

## BernieKe

Quick question: is it ok for me to set the http-replicator cache to /usr/portage/distfiles?

----------

## flybynite

no.

----------

## BernieKe

That's what I thought, but I had it working like this for the past few days, and everything seemed ok.  (It happened by accident, by the way; I discovered what I'd done after already having emerged a number of packages.)

But now I get why it was working: apparently I had commented out the http_proxy setting on the machine that's running http-replicator.

Considering that the replicator host doesn't use the replicator, and only serves the distfiles directory out, would there still be an issue with a setup like this?

It seems to make life easier, removing the need for double writes and repcacheman (or manual alternatives).

If you could explain why the above would be a bad thing to do (if it actually is), I'd appreciate it.

Thanks,

Bernie

----------

## flybynite

Scroll up a couple of posts and see the link for the new, improved version of repcacheman that is much faster and much less of a memory hog and actually works with the latest portage.

 *BernieKe wrote:*   

> Considering that the replicator host doesn't use the replicator, and only serves the distfiles directory out, would there still be an issue with a setup like this?
> 
> 

 

All boxes should have http_proxy set to point at replicator's cache, even the box hosting replicator.  Anything less means the cache isn't being fully used.
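So the replicator host's own /etc/make.conf carries the same proxy line, pointed at itself.  A sketch; 8080 is the example port from the first page of this howto, so match whatever your actual config uses:

```shell
# /etc/make.conf on the box running http-replicator itself:
# loop portage's fetches back through the local proxy cache.
http_proxy="http://127.0.0.1:8080"
```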

 *BernieKe wrote:*   

> 
> 
> If you could explain why the above would be a bad thing to do (if it actually is), I'd appreciate it.
> 
> 

 

The short answer is portage is a bad neighbor.  It leaves half-downloaded and corrupt files lying around the distfiles dir and, as a bonus, leaves those junk files owned by root.  Http-replicator will serve those corrupt, incomplete files to other clients because it doesn't checksum the files.  Replicator doesn't do the checksums because it isn't gentoo specific, plus it streams files to clients as they are received.  It couldn't even try to do checksums until the whole file was received, which means it couldn't simultaneously stream files as they are downloaded.  repcacheman is gentoo specific and does the checksums, but only when requested, not continuously.

repcacheman deletes dups, and imports files to the cache that pass the checksum test, if any.  Test your system and see how often it actually has to do checksums in real use.  I haven't had to do checksums in many months.  This isn't true for all parts of the world; some areas have better ftp mirrors close to them.  Just make sure you have a full, complete set of http mirrors defined, and no ftp mirrors if you can.  Portage will still download by ftp even if no ftp mirrors are listed in GENTOO_MIRRORS.

Checksums are kinda expensive to do, but rarely happen for most users.  If you didn't mind losing the mostly rare ftp downloads, you could just rm -rf /usr/portage/distfiles on the server.  But some users are on dialup and take days to download openoffice etc.  rm -rf will lose the partial download; repcacheman won't.
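The http-only mirror setup looks something like this in /etc/make.conf.  distfiles.gentoo.org is a real mirror; the second URL is a placeholder — use mirrorselect to pick mirrors near you:

```shell
# Only http mirrors, so every fetch can go through http-replicator;
# mirror.example.org is a made-up stand-in for a local mirror.
GENTOO_MIRRORS="http://distfiles.gentoo.org http://mirror.example.org/gentoo"
```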

----------

## BernieKe

Thanks a lot for the clear reply!

I've installed the updated repcacheman, and everything works fine once again.

I do however have one small patch for you, in order to also ignore git and mercurial sources.

```

93c93

< dc=filecmp.dircmp (distdir,dir,['cvs-src','git-src','hg-src','.locks'])

---

> dc=filecmp.dircmp (distdir,dir,['cvs-src','.locks'])

```

----------

## flybynite

 *BernieKe wrote:*   

> 
> 
> I do however have one small patch for you, in order to also ignore git and mercurial sources.
> 
> 

 

Ignoring git sounds good, I'm not familiar with mercurial but since you asked I added it also.   Thanks!

----------

## neosimago

I'm having trouble with the repcacheman python script. Re-installing python doesn't fix the problem, and other python scripts seem to work fine. Here's the output: 

```
Checking authenticity and integrity of new files...

Searching for ebuilds...

Done!

Found 25230 ebuilds.

Extracting the checksums....

Done!

Verifying checksum's....

/usr/portage/distfiles/GDM-FlyAway.tar.gz

Traceback (most recent call last):

  File "/usr/bin/repcacheman.py", line 203, in ?

    if t["MD5"]:

KeyError: 'MD5'
```

more of the output can be found at : http://rafb.net/p/FmewYB48.html

=> I have tried removing the problem files to no avail, and supposedly the script is designed to handle them anyway. I'll check in on the forums to see if this gets fixed later. --kudos!

Last edited by neosimago on Thu Nov 22, 2007 6:46 pm; edited 1 time in total

----------

## flybynite

 *neosimago wrote:*   

> I'm having troubles with the repcacheman python script. Re-installing python doesn't fix the problem, and other python script seems to work fine. Here's the output: 
> 
> 

 

Fixed a long time ago, but not updated by the gentoo maintainer.

See this post

https://forums.gentoo.org/viewtopic-t-173226-postdays-0-postorder-asc-start-539.html

----------

## neosimago

Thanks a bunch! I'm using your 4.3 beta of repcacheman and it's working out fine. 

 *flybynite wrote:*   

>  *neosimago wrote:*   I'm having troubles with the repcacheman python script. Re-installing python doesn't fix the problem, and other python script seems to work fine. Here's the output: 
> 
>  
> 
> fixed a long time ago but not updated by the gentoo maintainer.
> ...

 

----------

## neosimago

flybynite:

I'm using a gentoo local server with squid running. Will this conflict or be redundant with http-replicator also installed? I'm not running into any problems now, and I hope not in the future, so I'll keep you posted on this issue if anything shows up. 

I would also like to add apt-get ubuntu package and source mirrors to my gentoo box, because I run a mixed environment. How would I do that? I haven't been able to find a good source online. Apparently the http_proxy changes in /etc/make.conf don't work the same way as they do in ubuntu distros. More insight into how http-replicator works would definitely help in this situation. I don't suppose repcacheman would be of any help with an apt-get http source mirror, because it's gentoo specific, right? Any thoughts about writing a py script for apt-get mirrors? 

And, yes, the rep4.py script does run faster with fewer resources. Keep up the good work!

----------

## flybynite

 *neosimago wrote:*   

> 
> 
> I'm using a gentoo local server with squid running. Will this be conflicting or redundant with http-replicator also installed?
> 
> 

 

Squid doesn't know anything about portage so it is actually very inefficient with portage.  When I was looking for a cache I first tried squid and was totally unsatisfied so that helped spur me on to seeing replicator developed.  Don't get me wrong, squid is good at some things, just not with portage.

 *neosimago wrote:*   

> 
> 
> I don't suppose repcacheman would be of any help in an Apt-get http source mirror, because it's gentoo specific right?

 

Actually it's the opposite.  replicator was developed first as a debian apt-get style cache with some general purpose http caching as well.  I worked with the developer gertjan to add gentoo specific features later when I was looking to develop a better cache and not wanting to start from scratch.

I've not run debian style in a long time so forgive me if I miss something, but removing the -s and -f options from /etc/init.d/http-replicator should return replicator to a debian style cache.  There may be other options that are helpful, check the man page, but I know these two specific options are only usable with gentoo.

The problem is, without those two options, replicator doesn't work well with portage and suffers some of the same problems as squid, namely the miss rate skyrockets.

So I'd bet you would have to run two instances of http-replicator on different ports to serve both gentoo and debian style clients.  That setup should work well and be very efficient  :Smile: 

----------

## neosimago

flybynite

Thanks for the reply. I'm looking through the man pages for http-replicator, and tried the home page: http://gertjan.freezope.org/replicator <= apparently I'm getting a bad gateway message trying to access only that particular page. I would very much like to find more information about the best way to run two instances of replicator: one for gentoo, which I have working well now, and one for ubuntu's apt-get. I suppose I could re-create another /etc/init.d/http-replicator script to start under another name, but that would be like having a rogue package loose on my system. What would be the best way to port the replicator available on the ubuntu platform so that it provides the same functions on a gentoo system for apt-caching?

----------

## flybynite

 *neosimago wrote:*   

>  I suppose i could re-create another /etc/init.d/http-replicator script to start under another name, but that would be like having a rogue package loose on my system. What would be the best way to port replicator available on the ubuntu platform so that it provides the same functions on a gentoo system for apt-caching?

 

What kind of box is it, ubuntu or gentoo?  If gentoo, use /usr/local/portage to create a duplicate package, changing the ebuild to not conflict.  You could make the new version 99 and use slotting, for example.

This way the package isn't rogue....

There is some info about ebuilds here http://www.gentoo.org/proj/en/devrel/handbook/handbook.xml?part=2&chap=1#doc_chap2

If ubuntu, sorry, I can't help you.

----------

## neosimago

 *flybynite wrote:*   

>  *neosimago wrote:*    I suppose i could re-create another /etc/init.d/http-replicator script to start under another name, but that would be like having a rogue package loose on my system. What would be the best way to port replicator available on the ubuntu platform so that it provides the same functions on a gentoo system for apt-caching? 
> 
>  *Quote:*   
> 
> What kind of box is it ubuntu or gentoo?  If gentoo use /usr/local/portage to create a duplicate package changing the ebuild to not conflict.  You could make the new version 99 and use slotting for example. 
> ...

 

----------

## myceliv

 *neosimago wrote:*   

> flybynite
> 
> .. tried the home page: http://gertjan.freezope.org/replicator <=apparently i'm getting a message of bad gateway trying to access only that particular page. 

 

Hopefully you can view that page. Looks like it shouldn't be too hard to do what you want; there are specific tips on that page about 'apt-cacher' (which is in synaptic, so you could look at how it's configured), and gertjan's site also talks about using 'http-replicator' with debs. A quick search of Ubuntu Forums yields nothing. I'm surprised. Bet if you start a thread there you'll get sorted in no time.

----------

## uprooter

Is there a way to tell the replicator only to download and save locally???

I need this because I share my portage tree using NFS and I want a single point of download for all of my servers.

It's a big waste if I download using the replicator: the data is sent to the client, and then the client sends the file back to the server over NFS.

----------

## flybynite

 *uprooter wrote:*   

> Is there a way to tell the replicator only to download and save locally???
> 
> 

 

Yes, stop using NFS  :Smile: 

Replicator and NFS do mostly the same thing, and you only need one or the other in most cases.  Are you trying to do something special?

----------

## uprooter

The difference is that with NFS I'm holding only one copy of the portage tree on my LAN.

With http-replicator I need to have a local portage tree on each machine.

Correct me if I'm wrong.

----------

## flybynite

 *uprooter wrote:*   

> The difference is that with NFS I'm holding only one copy of the portage tree on my LAN.
> 
> With http-replicator I need to have a local portage tree on each machine.
> 
> Correct me if I'm wrong.

 

You're not really wrong, but there is a way.

Just move the packages and the distfile dir out from under the "portage" dir by adding this to /etc/make.conf

```

PKGDIR=/var/tmp/packages

DISTDIR=/var/tmp/distfiles

```

Now you can NFS share the ebuilds etc. and only sync once but still have replicator manage and share binary packages and distfiles.  This will prevent the double network copy you were asking about.

This is similar to what I use.  I didn't like duplicating portage on all my boxes and laptops, it was a waste of space.  I also didn't like not being able to upgrade or add packages while the server was down or my laptop was on the road so I needed each box to have a copy of portage.

A long time ago I found https://forums.gentoo.org/viewtopic-t-401647.html.  I now use a variation of this system.  I can compress portage down to ~47MB and copy this to my laptops and other boxes using replicator.  I have a custom script that does this for me.

For some more ideas on the compressed portage approach, see https://forums.gentoo.org/viewtopic-t-465367.html, or if you want to know more about my simple system just ask.

----------

## uprooter

```
PKGDIR=/var/tmp/packages

DISTDIR=/var/tmp/distfiles 
```

Not really helpful since the distfiles are now stored on each machine.

I want only one copy of the tree and the distfiles (which is heavier than the tree in my case) on the LAN.

I'm talking about 20 old desktops/servers, some of which have 6GB IDE/SCSI HDDs.

Well,

I can implement this on my own with a little perl daemon that listens for incoming connections and spawns wget for each URL request it gets from the clients, then configure the clients with a special FETCHCOMMAND in make.conf that submits the URL to that perl daemon.

I just thought that http-replicator would be able to do it, since it's already part of its job.
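For what it's worth, the client side of that scheme would just be a FETCHCOMMAND override in /etc/make.conf. Everything below is hypothetical: the host, port 9999, and the URL format are made-up placeholders for whatever the home-grown daemon would actually accept; the daemon would fetch into the NFS-shared distdir, where the client then finds the file.

```
# Hypothetical client-side make.conf fragment -- host, port and the
# /fetch?uri= URL format are placeholders for the custom fetch daemon.
FETCHCOMMAND='curl --fail "http://192.168.0.1:9999/fetch?uri=${URI}"'
RESUMECOMMAND="${FETCHCOMMAND}"
```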

----------

## flybynite

 *uprooter wrote:*   

> 
> 
> Not really helpful since the distfiles are now stored on each machine.
> 
> I want only one copy of the tree and the distfiles (which is heavier  than the tree in my case) on the LAN.
> ...

 

All your distfiles don't need to be stored on each machine.  Only the files needed for the package you are emerging are downloaded, and they only need to be there long enough for the emerge.  After the update, delete them with rm -rf /var/tmp/distfiles.

Emerging is much faster when the files are local rather than pulled over the network anyway.

I still don't understand what you want.  Why use replicator at all in your case?  Is there some advantage to you that I don't see?

----------

## dbishop

From the http-replicator server: no packages will download via http, they always fail and then fetch using ftp. Same thing happens from clients, http fails but ftp succeeds. If I comment out the http_proxy line in the client /etc/make.conf file, the same files will always download via http.  Same is true for the http-replicator server.

```
>>> Emerging (1 of 2) x11-libs/goffice-0.6.2 to /

>>> Downloading 'http://gentoo.osuosl.org/distfiles/goffice-0.6.2.tar.bz2'

--22:53:36--  http://gentoo.osuosl.org/distfiles/goffice-0.6.2.tar.bz2

           => `/usr/portage/distfiles/goffice-0.6.2.tar.bz2'

Resolving local.mydomain.com... 192.168.0.243

Connecting to local.mydomain.com|192.168.0.243|:8090... connected.

Proxy request sent, awaiting response... 400 Bad Request

22:53:36 ERROR 400: Bad Request.

>>> Downloading 'http://distfiles.gentoo.org/distfiles/goffice-0.6.2.tar.bz2'

--22:53:36--  http://distfiles.gentoo.org/distfiles/goffice-0.6.2.tar.bz2

           => `/usr/portage/distfiles/goffice-0.6.2.tar.bz2'

Resolving local.mydomain.com... 192.168.0.243

Connecting to local.mydomain.com|192.168.0.243|:8090... connected.

Proxy request sent, awaiting response... 400 Bad Request

22:53:36 ERROR 400: Bad Request.

>>> Downloading 'http://www.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/goffice-0.6.2.tar.bz2'

--22:53:36--  http://www.ibiblio.org/pub/Linux/distributions/gentoo/distfiles/goffice-0.6.2.tar.bz2

           => `/usr/portage/distfiles/goffice-0.6.2.tar.bz2'

Resolving local.mydomain.com... 192.168.0.243

Connecting to local.mydomain.com|192.168.0.243|:8090... connected.

Proxy request sent, awaiting response... 400 Bad Request

22:53:36 ERROR 400: Bad Request.
```

After this, the ftp fetching begins and succeeds, but of course the cache is not properly updated.  If the client ftp's the files, they do not end up on the server at all; if the server ftp's the files they end up in /usr/portage/distfiles/ as expected and can be moved with repcacheman, and then the client will get them from the http-replicator cache as desired. tcpdump shows that the requests are going to the http-replicator server properly.

I'm sure it's something obvious, but I'm just missing it. Any help would be appreciated...

I should mention that everything works perfectly if the files are already in /var/cache/http-replicator/ and I don't have any issues with repcacheman.

Edited to add:

It seems that http-replicator is addressing the proxied http request to the default gateway rather than the true destination server.  The local client x.x.x.11 uses the http_proxy x.x.x.243, which will return the file properly if it is in its cache. If it is not there, the http_proxy server (x.x.x.243) asks x.x.x.1 (the default gateway) for the file.  Any requests made by the http_proxy machine itself (as opposed to requests made through it from other clients) are properly fetched and placed in its local distfiles cache, repcacheman works properly on them, and then the x.x.x.11 client will fetch properly from the http_proxy machine (x.x.x.243).

I suspect it's something I've misconfigured, but I'm just not seeing it. Any help is greatly appreciated.

----------

## flybynite

How about showing your configs on the server and the clients?

----------

## melinux

I've been running http-replicator for a while. Now I started getting:

```

/usr/lib/portage/pym/portage_manifest.py:39: DeprecationWarning: DEPRECATION NOTICE: The portage_manifest module was replaced by portage.manifest

/usr/lib/portage/pym/portage_checksum.py:39: DeprecationWarning: DEPRECATION NOTICE: The portage_checksum module was replaced by portage.checksum

/usr/lib/portage/pym/portage_exception.py:39: DeprecationWarning: DEPRECATION NOTICE: The portage_exception module was replaced by portage.exception

```

Is there going to be any related update?

----------

## flybynite

I seem to remember fixing this a while ago, I guess it might be changing again?  What version of replicator and portage are you running?

----------

## melinux

 *flybynite wrote:*   

> I seem to remember fixing this a while ago, I guess it might be changing again?  What version of replicator and portage are you running?

 

portage version 2.1.6.4

and http-replicator version 3.0-r1 (~x86)

Thanks,

melinux

----------

## flybynite

 *melinux wrote:*   

> 
> 
> portage version 2.1.6.4
> 
> 

 

Portage switched from portage_manifest to portage.manifest in version 2.2-r6 or greater, and repcacheman is fixed for that version.  Now the change seems to have been backported to at least your portage version 2.1.6.4.

Could you check to see if the current fix works for you to confirm this?

Just cp the file /usr/portage/net-proxy/http-replicator/files/http-replicator-3.0-repcacheman-0.44-r1 somewhere and run it instead of the system version.

Here is some code if you need it.

```

mkdir /root/repcachemantest

cd /root/repcachemantest

cp  /usr/portage/net-proxy/http-replicator/files/http-replicator-3.0-repcacheman-0.44-r1 .

./http-replicator-3.0-repcacheman-0.44-r1

```

----------

## jongeek

 *flybynite wrote:*   

> 
> 
> Could you check to see if the current fix works for you to confirm this?
> 
> Just cp the file /usr/portage/net-proxy/http-replicator/files/http-replicator-3.0-repcacheman-0.44-r1 somewhere and run it instead of the system version.
> ...

 

I was having the same warnings with portage 2.1.6.7. I ran the repcacheman script as you described, and then it did not display the warnings.

----------

## melinux

 *jongeek wrote:*   

>  *flybynite wrote:*   
> 
> Could you check to see if the current fix works for you to confirm this?
> 
> Just cp the file /usr/portage/net-proxy/http-replicator/files/http-replicator-3.0-repcacheman-0.44-r1 somewhere and run it instead of the system version.
> ...

 

At present I can't test that PC, as it's not working right now (hardware overheating problems)...

I installed the current http-replicator and I have no problems as yet, on another machine. Thanks anyway.

----------

## Onkl

I have set up http-replicator to also serve as PORTAGE_BINHOST. Works as advertised.

However, when I use

```

quickpkg --include-config=y <package>

```

to make a package with my settings adapted to my LAN, the package gets permissions of 600, which http-replicator will not serve. Changing the permissions to 644 solves this, but it is a hassle that I forget most of the time.

Is there a way to make http-replicator also serve those files, or make quickpkg set the correct permissions?

----------

## flybynite

 *Onkl wrote:*   

> Is there a way to make http-replicator also serve those files, or make quickpkg set the correct permissions?

 

```
h2 ~ # quickpkg --help

  --umask=UMASK         umask used during package creation (default is 0077)

```

Quickpkg's default umask of 0077 prevents all users except root from even reading the file.  I can't explain why it does that.  Replicator can't change or work around such drastic file permissions.

There must be many others who feel that default umask is too drastic, because quickpkg does give you the option to change it to a more normal 022.
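The 600/644 difference is just umask arithmetic (0666 & ~0077 = 0600, 0666 & ~0022 = 0644). A quick portage-free demonstration:

```shell
# Plain-shell demo of why quickpkg's default umask yields unreadable
# packages: new files get mode 0666 masked by the process umask.
demo=$(mktemp -d)
cd "$demo"
( umask 0077; touch restricted.tbz2 )   # quickpkg's default
( umask 0022; touch shared.tbz2 )       # quickpkg --umask=022
stat -c '%a %n' restricted.tbz2 shared.tbz2
# -> 600 restricted.tbz2
# -> 644 shared.tbz2
```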

Try

```

h2 ~ # quickpkg --umask=022  nano

h2 ~ # ls -l /var/tmp/packages/All/nano*

-rw-r--r-- 1 root root 202145 2009-12-03 20:53 /var/tmp/packages/All/nano-2.1.10.tbz2

```

Now replicator can read the file and serve it to others.

You can make this easy by putting

```
alias quickpkg='quickpkg --umask=022'
```

 in your ~/.bashrc

----------

## Onkl

Thanks a lot.

And since an "emerge -b" builds the package with 644, it is indeed strange that quickpkg is so restrictive.

----------

## sam_i_am

Hi,

Thanks for this great program. I'm using at home and at work without any issues except:

On one host (x86), http-replicator doesn't start. It says "failed to start service". The log says "HttpReplicator started", but then it seems to die. I enabled the debug option, but got no other error messages. However, on a different host (x86_64) on the same network, it works with the exact same config parameters.

Any tips on how to go about finding out why?

Other than being x86, the failing host also runs apache on port 80 and 443, but port 8080 is clear.

Sam

----------

## knight77

Hi there.

The problem is still there (http-replicator-3.0-r2 stable in portage x86).

I filed a bug on it: https://bugs.gentoo.org/show_bug.cgi?id=339079

There is also another thread here about this problem, workaround included: https://forums.gentoo.org/viewtopic-t-787761.html

----------

## Havin_it

Thought this might be worth a mention, even though it's a bit non-standard...

I've used http-replicator for years, and I've just realised I wasn't doing so in the prescribed way. I'd never even heard of repcacheman until today, in fact. I never liked the idea of the distfiles ending up duplicated on the server, so I do things this way:

 Set http-replicator's cache dir to the same as my $DISTDIR

 Local clients use the proxy, server machine does not

The upshot is that when clients fetch a distfile, it goes in (or is already in, and gets served from) the cache via http-replicator; when the server fetches, the files are saved in the same dir and http-replicator will serve these for local clients as well, even though it didn't fetch them personally.

This seems to have worked fine for me over the years, though I'd welcome hearing about any drawbacks to this approach that y'all veterans might be able to identify. Hope it helps someone.
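In config terms, the trick is just pointing the cache at the same directory the server's make.conf uses as DISTDIR. Variable and option names below are hedged from memory and differ between versions; check `http-replicator --help` and your own /etc/conf.d/http-replicator before copying anything.

```
# /etc/conf.d/http-replicator on the server (illustrative -- the
# variable carrying the daemon options varies between versions)
GENERAL_OPTS="--dir /usr/portage/distfiles"

# /etc/make.conf on the server: DISTDIR kept at the same path, so
# server-side fetches and proxied client fetches land together
DISTDIR="/usr/portage/distfiles"
```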

----------

## depontius

I had http-replicator fail to come up a week or two ago, also.  The problem was that the lockfile is at /var/run/http-replicator/http-replicator.pid, and the portage user didn't have write access.

A while back Linux adopted a tmpfs-based /run directory, and /var/run became a symlink to /run.  When it was in /var/run it was persistent, and the emerge process could chown /var/run/http-replicator to portage, and it would stick.  Now that /run is tmpfs, it's recreated on every boot, and such things don't stick.  The fix would be to do a chown against /run/http-replicator in the initscript, prior to changing to the portage userid.

The problem I have is that this doesn't universally fail, so I don't understand what's going on.  A few days ago that system lost power, and when I powered it back up, I made a mental note about http-replicator, since I hadn't actually changed the initscript.  Then I forgot.  Upon seeing this thread, I checked that system and http-replicator is running, even though it doesn't seem that it should be.  Moreover, now the lockfile is /run/http-replicator.pid, owned by root, instead of /run/http-replicator/http-replicator.pid, owned by portage.  

On another system where I'm running http-replicator it's still on the deeper path.  Looking into the initscripts, the locations I see for the lockfile are correct, yet they're all 3.0-r3 - I don't quite get it.

----------

## khayyam

 *depontius wrote:*   

> A while back Linux adopted a tmpfs-based /run directory, and /var/run became a symlink to /run.  When it was in /var/run it was persistent, and the emerge process could chown /var/run/http-replicator to portage, and it would stick.  Now that /run is tmpfs, it's recreated on every boot, and such things don't stick.  The fix would be to do a chown against /run/http-replicator in the initscript, prior to changing to the portage userid.

 

depontius ... when this migration happened, openrc introduced an implementation of systemd's tmpfiles.d. This implementation is 100% compatible with the tmpfiles.d manpage, though openrc doesn't create the configuration dir /etc/tmpfiles.d on install.

```
# equery files =sys-apps/openrc-0.11.8 |grep tmpfiles

/etc/conf.d/tmpfiles

/etc/init.d/tmpfiles.setup

/lib/rc/sh/tmpfiles.sh

/usr/share/openrc/runlevels/boot/tmpfiles.setup
```

So, using the above, and adding tmpfiles.setup to the runlevel, you should be able to have /run/http-replicator created with the correct ownership, etc.
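For example, a one-line tmpfiles.d entry would recreate the directory on every boot. The file name is arbitrary, and the path and ownership follow depontius's description above:

```
# /etc/tmpfiles.d/http-replicator.conf
# type  path                  mode  uid      gid      age
d       /run/http-replicator  0755  portage  portage  -
```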

HTH & best ... khay

----------

## depontius

Thanks for the info.  I'd seen stuff on /etc/tmpfiles.d as a systemd thing, but didn't realize that OpenRC had it, as well.

What bothers me more right now is that I have 2 systems running http-replicator-3.0-r3, both x86.  One has /run/http-replicator.pid and the other has /run/http-replicator/http-replicator.pid - in the initscript.

----------

## Havin_it

Just wondering, is it possible/supported to run multiple instances of http-replicator? I've got another caching-proxying task it might be useful for, but I want to use a different cache dir for it.

Possible?

----------

## mbar

Does anybody have the same problem as me (~amd64)? http-replicator-4.0_alpha2

```
/etc/init.d/http-replicator start

 * Caching service dependencies ...                                                                                                                                                     [ ok ]

 * Starting Http-Replicator ...

Traceback (most recent call last):

  File "/usr/bin/http-replicator", line 4, in <module>

    import Params, Request, Response, fiber, weakref

ImportError: No module named Params

 * start-stop-daemon: failed to start `/usr/bin/http-replicator'

 * Failed to start Http-Replicator                                                                                                                                                      [ !! ]

 * ERROR: http-replicator failed to start

```

----------

## ToeiRei

same problem here. Trying to track it down.

Update 1:

Looks like a namespace/path problem to me, as removing 'Params' made it complain about the next missing module.

Solved by downgrading to net-proxy/http-replicator-3.0-r3

----------

## TomWij

Please report bugs at Bugzilla, we don't keep track of bugs on the forum. Thanks.

 *Quote:*   

> +  05 Jun 2013; Tom Wijsman <TomWij@gentoo.org>
> 
> +  +http-replicator-4.0_alpha2-r1.ebuild:
> 
> +  Revision bump, added missing Python modules. Fixes bug #472122 reported by
> ...

 

----------

## jpc22

I've read the whole thread and a lot of other documentation for hours, and I can't get the damn thing to work.

CLIENT make.conf:

```
http_proxy="http://192.168.1.138:8080"

PORTAGE_BINHOST="192.168.1.138:8080/usr/portage/packages"
```

(I tried all possible variations: IP without port, /, /usr, /All...)

```
CLIENT ~ # emerge -g jfsutils

--2014-12-08 18:07:57--  http://192.168.1.138:8080/usr/portage/packages/Packages

Connecting to 192.168.1.138:8080... connected.

HTTP request sent, awaiting response... No data received.

Retrying.

--2014-12-08 18:07:58--  (try: 2)  http://192.168.1.138:8080/usr/portage/packages/Packages

Connecting to 192.168.1.138:8080... connected.

HTTP request sent, awaiting response... No data received.

Retrying.

--2014-12-08 18:08:00--  (try: 3)  http://192.168.1.138:8080/usr/portage/packages/Packages

Connecting to 192.168.1.138:8080... connected.

HTTP request sent, awaiting response... No data received.

Giving up.

Fetcher exited with a failure condition.

!!! Error fetching binhost package info from '192.168.1.138:8080/usr/portage/packages'

!!! FETCHCOMMAND_192.168.1.138 failed
```

----------

## Havin_it

It appears you're trying to use http-replicator as a regular HTTP server to serve up your binpkgs, which won't work: it is only a proxy, and does nothing unless you have a proper server running to serve the files from. Http-replicator simply caches a copy of the file as you download it from the proper server (or serves its cached copy if it's already present). Either way, you need to be able to fetch the file without http-replicator, or you will not be able to do so with it.

So you need to install apache, lighttpd or another http server and configure it to serve your binpkgs. And if that server is on your LAN, then there's nothing to gain by using http-replicator, and you'll just end up with two copies of every file on your binhost: one wherever your http server is serving it from, and one in the cache.

----------

## flybynite

I know this is an old post, but the previous post has been possibly misleading new users for too long already.

 *Quote:*   

> It appears you're trying to use http-replicator as a regular HTTP server to serve up your binpkgs, which won't work: it is only a proxy, and does nothing unless you have a proper server 
> 
> 

 

This is incorrect - http-replicator will serve binpkgs by itself and doesn't require installing apache, lighttpd, or any other http server.  It is way more than just a simple proxy.

http-replicator continues to work just fine and I'm glad gentoo has kept it updated.

----------

## Thistled

This is really beginning to annoy me now.

I'm beginning to think the new layout for portage, i.e. /var/db/repos/gentoo and /var/cache/distfiles, is breaking the replicator.

```
 000C   Accepted request from [192.168.1.2]:34974

  000C   Waiting at 16: RECV(4,20:03:29)

  000C   Client sends GET /distfiles/layout.conf HTTP/1.1

  000C   Error: invalid url: /distfiles/layout.conf

[ IDLE ] Sun Feb  2 20:03:14 2020

[ BUSY ] Sun Feb  2 20:03:15 2020

[ 000D ] Sun Feb  2 20:03:15 2020

  000D   Accepted request from [192.168.1.2]:34976

  000D   Waiting at 16: RECV(4,20:03:30)

  000D   Client sends GET /distfiles/layout.conf HTTP/1.1

  000D   Error: invalid url: /distfiles/layout.conf

[ IDLE ] Sun Feb  2 20:03:15 2020

[ BUSY ] Sun Feb  2 20:03:17 2020

[ 000E ] Sun Feb  2 20:03:17 2020

  000E   Accepted request from [192.168.1.2]:34978

  000E   Waiting at 16: RECV(4,20:03:32)

  000E   Client sends GET /distfiles/layout.conf HTTP/1.1

  000E   Error: invalid url: /distfiles/layout.conf

[ IDLE ] Sun Feb  2 20:03:17 2020

[ BUSY ] Sun Feb  2 20:03:17 2020

[ 000F ] Sun Feb  2 20:03:17 2020

  000F   Accepted request from [192.168.1.2]:34980

  000F   Waiting at 16: RECV(4,20:03:32)

  000F   Client sends GET /distfiles/libgweather-3.32.2.tar.xz HTTP/1.1

  000F   Error: invalid url: /distfiles/libgweather-3.32.2.tar.xz

[ IDLE ] Sun Feb  2 20:03:17 2020

```

The above is what I am seeing in my logs.

This is the output of emerge:

```
>>> Emerging (1 of 1) dev-libs/libgweather-3.32.2-r1::gentoo

>>> Downloading 'http://pig2:8080/distfiles/layout.conf'

--2020-02-02 20:03:14--  http://pig2:8080/distfiles/layout.conf

Resolving pig2... 192.168.1.4

Connecting to pig2|192.168.1.4|:8080... connected.

HTTP request sent, awaiting response... No data received.

Retrying.

--2020-02-02 20:03:15--  (try: 2)  http://pig2:8080/distfiles/layout.conf

Connecting to pig2|192.168.1.4|:8080... connected.

HTTP request sent, awaiting response... No data received.

Retrying.

--2020-02-02 20:03:17--  (try: 3)  http://pig2:8080/distfiles/layout.conf

Connecting to pig2|192.168.1.4|:8080... connected.

HTTP request sent, awaiting response... No data received.

Giving up.

!!! Couldn't download '.layout.conf.pig2'. Aborting.

>>> Downloading 'http://pig2:8080/distfiles/libgweather-3.32.2.tar.xz'

--2020-02-02 20:03:17--  http://pig2:8080/distfiles/libgweather-3.32.2.tar.xz

Resolving pig2... 192.168.1.4

Connecting to pig2|192.168.1.4|:8080... connected.

HTTP request sent, awaiting response... No data received.

Retrying.

```

It finally pulls in libgweather from a mirror hosting gnome packages.

The thing is, libgweather definitely resides in /var/cache/distfiles, and it is also in /var/cache/http-replicator.

I think the version 4 alpha of replicator has changed, as some of the properties in the config file (/etc/conf.d/http-replicator) no longer work.

My setup...

/var/db/repos/gentoo is an NFS share on my main server; all gentoo clients use this share.

I'm trying to get each client to request a distfile from http-replicator on the server, but the replicator is saying the requested URL is invalid.

Any idea?

----------

## Gatak

I did an alternative setup. I exported my distfiles over samba with a password so that each client can fetch from and update the central repository. This also works with the portage tree itself, if you like. This way only one client needs to run eix-sync, and all others just do eix-update.

----------

## Thistled

 *Gatak wrote:*   

> I did an alternative setup. I exported my distfiles over samba with a password so that each client can fetch from and update the central repository. This also works with the portage tree itself, if you like. This way only one client needs to run eix-sync, and all others just do eix-update.

 

Yes, mine was very similar.

The NFS server hosted/shared /usr/portage and /usr/portage/distfiles. This was mounted as /mnt/nfs_portage on all clients.

The server sync'ed using rsync against mirrors.

Therefore the clients just had to eix-update and their databases were updated.

All clients could install / update packages without issue. When a package was needed, it would be downloaded into the distfiles folder.

I considered the shared /usr/portage/distfiles folder to be the central repository of source files for all clients on the network.

With this approach, I was saving considerable disk space on each client, as there was no need to store distfiles locally.

The problem I now have is the migration away from /usr/portage and /usr/portage/distfiles, to /var/db/repos/gentoo and /var/cache/distfiles respectively.

This is where I now have an opportunity to move the distfiles away from the server share, and hopefully use http-replicator to dispatch the packages to each client when needed.

The server is still hosting /usr/portage > /var/db/repos/gentoo to all clients who continue to access via /mnt/nfs_portage.

The server does the syncing as usual to the mirrors, and all the clients need to do is eix-update.

The problem lies with the clients requesting source files (distfiles) via http-replicator. The replicator will not dispatch the file, even though it exists.

Here is what I see from the http-replicator log:

```
 [ IDLE ] Mon Feb  3 16:58:02 2020

[ BUSY ] Mon Feb  3 16:58:02 2020

[ 0001 ] Mon Feb  3 16:58:02 2020

  0001   Accepted request from [192.168.1.2]:37812

  0001   Waiting at 16: RECV(4,16:58:17)

  0001   Client sends GET http://pig2:8080/distfiles/libgweather-3.32.2.tar.xz HTTP/1.1

  0001   Switching to HttpProtocol

  0001   Cache position: libgweather-3.32.2.tar.xz

  0001   Requesting address info for pig2:8080

  0001   Connecting to [127.0.0.1]:8080

  0001   Waiting at 36: SEND(5,16:58:17)

[ 0002 ] Mon Feb  3 16:58:02 2020

  0001   Waiting at 39: RECV(5,16:58:17)

  0002   Accepted request from [127.0.0.1]:49830

  0002   Waiting at 16: RECV(6,16:58:17)

  0002   Client sends GET /distfiles/libgweather-3.32.2.tar.xz HTTP/1.1

  0002   Error: invalid url: /distfiles/libgweather-3.32.2.tar.xz

  0001   Switching to ExceptionResponse

  0001   Traceback (most recent call last):

  0001     File "/usr/lib/python-exec/python2.7/http-replicator", line 40, in Replicator

  0001       protocol.recv( server )

  0001     File "/usr/lib/python2.7/site-packages/Protocol.py", line 149, in recv

  0001       assert chunk, 'server closed connection before sending a complete message header'

  0001   AssertionError: server closed connection before sending a complete message header

  0001   Waiting at 52: SEND(4,16:58:17)

  0001   Transaction successfully completed

[ IDLE ] Mon Feb  3 16:58:02 2020

```

I get the impression Python is borking out because a request is being made for a distfiles folder at the specified URL??

Is this because of the new behaviour portage has adopted, i.e. layout.conf, distfiles, __download__, etc.?

If so, then http-replicator should be marked as unstable, as it is not working.

Does anyone actually have http-replicator working out of the box today?

I do observe this particular forum post has gone very quiet, is it because the replicator just doesn't work now and is no longer maintained?

----------

## unheatedgarage

 *Thistled wrote:*   

> Does anyone actually have http-replicator working out of the box today?
> 
> I do observe this particular forum post has gone very quiet, is it because the replicator just doesn't work now and is no longer maintained?

 

It's a real shame The Replicator doesn't work anymore.

I'm running systemd and it hasn't been able to start for...maybe a year now? Anyway, I removed it a long time ago, and will be looking into using Samba or Squid in the future, unless someone steps up and reignites it.

It's a beautiful program that worked flawlessly for a long, long time, and I miss it.

Let's all keep being good netizens!

----------

