Send patches - preferably formatted by git format-patch - to patches at archlinux32 dot org.
summaryrefslogtreecommitdiff
path: root/lib/libalpm/dload.c
AgeCommit message (Collapse)Author
2020-07-14Check that destfile_name exists before using itAnatol Pomozov
In some cases (when trust_remote_name is used for a URL without a filename and no Content-Disposition is provided by the server) destfile_name will be NULL. In this case payload data will be stored in tempfile_name and no destfile_name is set. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-14Do not free payload fields in the middle of this structure useAnatol Pomozov
At the end of payload use it calls _alpm_dload_payload_reset() that will free() these and other fields anyway. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-14Build signature remote name based on the main payload nameAnatol Pomozov
The main payload final name might be affected by url redirects or Content-Disposition HTTP header value. We want to make sure that accompanion *.sig filename always matches the package filename. So ignore finalname/Content-Disposition for the *.sig file. It also helps to fix a corner case when the download URL does not contain a filename and server provides Content-Disposition for the main payload but not for the signature payload. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-07FS#33992: force download *.sig file if it does not exist in the cacheAnatol Pomozov
In case if *.pkg exists but *.sig file does not we still have to pass the pkg to multi_download API. To avoid redownloading *.pkg file we use CURLOPT_TIMECONDITION curl option. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-07Move signature payload creation to download engineAnatol Pomozov
Until now callee of ALPM download functionality has been in charge of payload creation both for the main file (e.g. *.pkg) and for the accompanied *.sig file. One advantage of such solution is that all payloads are independent and can be fetched in parallel thus exploiting the maximum level of download parallelism. To build *.sig file url we've been using a simple string concatenation: $requested_url + ".sig". Unfortunately there are cases when it does not work. For example an archlinux.org "Download From Mirror" link looks like this https://www.archlinux.org/packages/core/x86_64/bash/download/ and it gets redirected to some mirror. But if we append ".sig" to the end of the link url and try to download it then archlinux.org returns 404 error. To overcome this issue we need to follow redirects for the main payload first, find the final url and only then append '.sig' suffix. This implies 2 things: - the signature payload initialization need to be moved to dload.c as it is the place where we have access to the resolved url - *.sig is downloaded serially with the main payload and this reduces level of parallelism Move *.sig payload creation to dload.c. Once the main payload is fetched successfully we check if the callee asked to download the accompanied signature. If yes - create a new payload and add it to mcurl. *.sig payload does not use server list of the main payload and thus does not support mirror failover. *.sig file comes from the same server as the main payload. Refactor event loop in curl_multi_download_internal() a bit. Instead of relying on curl_multi_check_finished_download() to return number of new payloads we simply rerun the loop iteration one more time to check if there are any active downloads left. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-06-26Cleanup the old sequential download codeAnatol Pomozov
All users of _alpm_download() have been refactored to the new API. It is time to remove the old _alpm_download() functionality now. This change also removes obsolete SIGPIPE signal handler functionality (this is a leftover from libfetch days). Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
2020-06-26Convert '-U pkg1 pkg2' codepath to parallel downloadAnatol Pomozov
Installing remote packages using its URL is an interesting case for ALPM API. Unlike package sync ('pacman -S pkg1 pkg2') '-U' does not deal with server mirror list. Thus _alpm_multi_download() should be able to handle file download for payloads that either have 'fileurl' field or pair of fields ('servers' and 'filepath') set. Signature for alpm_fetch_pkgurl() has changed and it accepts an output list that is populated with filepaths to fetched packages. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
2020-05-09Convert download packages logic to multiplexed APIAnatol Pomozov
Create a list of dload_payloads and pass it to the new _alpm_multi_* interface. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09Implement multibar UIAnatol Pomozov
Multiplexed download requires ability to draw UI for multiple active progress bars. To implement it we use ANSI codes to move cursor up/down and then redraw the required progress bar. `pacman_multibar_ui.active_downloads` field represents the list of active downloads that correspond to progress bars. `struct pacman_progress_bar` is a data structure for a progress bar. In some cases (e.g. database downloads) we want to keep progress bars in order. In some other cases (package downloads) we want to move completed items to the top of the screen. Function `multibar_move_completed_up` allows to configure such behavior. Per discussion in the maillist we do not want to show download progress for signature files. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09Extend download callback interface with start/complete eventsAnatol Pomozov
With the previous download interface the callback uses the first progress event as 'download has started' signal. Unfortunately it does not work with up-to-date files that never receive 'download progress' events. Up-to-date database messages are currently handled in sync_syncdbs() after the sequential download is completed and a result from ALPM is received. But this is not going to work with multiplexed download interface that returns the result only after all files are completed. Another problem with 'first progress event is the beginning of the download' is that such events time are unpredictable. Thus the UI progress bar order might differ from what has been passed by client to alpm_dbs_update() function. We actually want to keep the dbs progress bars in a strict order. To help to solve the given problems extend the download callback to allow 2 more events - download started and completed. 'Download started' events appear in the same order as in the list given by a client. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09Implement multiplexed download using mCURLAnatol Pomozov
curl_multi_download_internal() is the main loop that creates up to 'ParallelDownloads' easy curl handles, adds them to mcurl and then performs curl execution. This is when the paralled downloads happens. Once any of the downloads complete the function checks its result. In case if the download fails it initiates retry with the next server from payload->servers list. At the download completion all the payload resources are cleaned up. curl_multi_check_finished_download() is essentially refactored version of curl_download_internal() adopted for multi_curl. Once mcurl porting is complete curl_download_internal() will be removed. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09Implement _alpm_multi_downloadAnatol Pomozov
It is an equivalent of _alpm_download but accepts a list of payloads. curl_multi_download_internal() is a stub at this moment and will be implemented in the later commits of this patch series. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09Inline dload_payload->curlerr field into a local variableAnatol Pomozov
dload_payload->curlerr is a field that is used inside curl_download_internal() function only. It can be converted to a local variable. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09Add multi_curl handle to ALPM global contextAnatol Pomozov
To be able to run multiple download in parallel efficiently we need to use curl_multi interface [1]. It introduces a set of APIs over new type of handler 'CURLM'. Create CURLM object at the application start and set it to global ALPM context. The 'single-download' CURL handle moves to payload struct. A new CURL handle is created for each payload with intention to be processed by CURLM. Note that curl_download_internal() is not ported to CURLM interface due to the fact that the function will go away soon. [1] https://curl.haxx.se/libcurl/c/libcurl-multi.html Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09Introduce alpm_dbs_update() function for parallel db updatesAnatol Pomozov
This is an equivalent of alpm_db_update but for multiplexed (parallel) download. The difference is that this function accepts list of databases to update. And then ALPM internals download it in parallel if possible. Add a stub for _alpm_multi_download the function that will do parallel payloads downloads in the future. Introduce dload_payload->filepath field that contains url path to the file we download. It is like fileurl field but does not contain protocol/server part. The rationale for having this field is that with the curl multidownload the server retry logic is going to move to a curl callback. And the callback needs to be able to reconstruct the 'next' fileurl. One will be able to do it by getting the next server url from 'servers' list and then concat with filepath. Once the 'parallel download' refactoring is over 'fileurl' field will go away. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2020-04-13Use GOTO_ERR throughoutAllan McRae
The GOTO_ERR define was added in commit 80ae8014 for use in future commits. There are plenty of places in the code base it can be used, so convert them. Signed-off-by: Allan McRae <allan@archlinux.org>
2020-02-10build-aux/update-copyright 2019 2020Allan McRae
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-01-28Docs docs docsmorganamilo
libalpm: move docs from .c files into alpm.h And fix/expand some along the way. Signed-off-by: Allan McRae <allan@archlinux.org>
2020-01-27Fix "pacman -U <url>" operationsAllan McRae
Commit e6a6d307 detected complete part files by comparing a payload's max_size to initial_size. However, these values are also equal when we use pacman -U on a URL as max_size is set to 0 in that case. Add a further condition to avoid that. Signed-off-by: Allan McRae <allan@archlinux.org>
2020-01-07Use c99 struct initialization to avoid memset callsDave Reisner
This is guaranteed less error prone than calling memset and hoping the human gets the argument order correct.
2019-11-15Handle .part files that are the size of the correct packageAllan McRae
In rare cases, likely due to a well timed Ctrl+C, but possibly due to a broken mirror, a ".part" file may have size at least that of the correct package size. When encountering this issue, currently pacman fails in different ways depending on where the package falls in the list to download. If last, "wrong or NULL argument passed" error is reported, or a "invalid or corrupt package" issue if not. Capture these .part files, and remove the extension. This lets pacman either use the package if valid, or offer to remove it if it fails checksum or signature verification. Signed-off-by: Allan McRae <allan@archlinux.org>
2019-10-23Update copyright yearsAllan McRae
make update-copyright OLD=2018 NEW=2019 Signed-off-by: Allan McRae <allan@archlinux.org>
2019-10-07dload: never return NULL from get_filenameDave Reisner
Downloads with a Content-Disposition header will typically not include slashes. When they do, we should most certainly only take the basename, but when they don't, we should treat the header value as the filename. Crash introduced in d197d8ab82cf when we started using get_filename in order to rightfully avoid an arbitrary file overwrite vulnerability. Signed-off-by: Allan McRae <allan@archlinux.org>
2019-06-28Correctly report a download failiure for 404smorganamilo
Currently when caling alpm_trans_commit, if fetching a package restults in a 404 (or other non 400 response code), the function returns -1 but errno is never set. This patch sets errno to ALPM_ERR_RETRIEVE. Signed-off-by: Allan McRae <allan@archlinux.org>
2019-03-01Sanitize file name received from Content-Disposition headerAndrew Gregory
When installing a remote package with "pacman -U <url>", pacman renames the downloaded package file to match the name given in the Content-Disposition header. However, pacman does not sanitize this name, which may contain slashes, before calling rename(). A malicious server (or a network MitM if downloading over HTTP) can send a content-disposition header to make pacman place the file anywhere in the filesystem, potentially leading to arbitrary root code execution. Notably, this bypasses pacman's package signature checking. For example, a malicious package-hosting server (or a network man-in-the-middle, if downloading over HTTP) could serve the following header: Content-Disposition: filename=../../../../../../usr/share/libalpm/hooks/evil.hook and pacman would move the downloaded file to /usr/share/libalpm/hooks/evil.hook. This invocation of "pacman -U" would later fail, unable to find the downloaded package in the cache directory, but the hook file would remain in place. The commands in the malicious hook would then be run (as root) the next time any package is installed. Discovered-by: Adam Suhl <asuhl@mit.edu> Signed-off-by: Allan McRae <allan@archlinux.org>
2019-02-07libalpm: prevent 301 redirect loop from hanging the processMark Ulrich
If a mirror responds with a 301 redirect to itself, it will create an infinite redirect loop. This will cause pacman to hang, unresponsive to even a SIGINT. The result is pacman being unable to sync or download any package from a particular repo if its current mirror is stuck in a redirect loop. Setting libcurl's MAXREDIRS option effectively prevents a redirect loop from hanging the process. Signed-off-by: Mark Ulrich <mark.ulrich.86@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2018-10-17alpm: Fix SIGINT handling re: aborting downloadOlivier Brunel
Upon receiving SIGINT a flag is set to abort the (curl) download. However, since it was never reset/initialized, if a front-end doesn't actually exit on SIGINT, and later tries any operation that needs to perform a new download, said download would always get aborted right away due to the flag not having been reset.
2018-10-17alpm: Do not raise SIGINT when filesize goes over limitOlivier Brunel
Variable dload_interrupted is used both to abort a download because SIGINT was caught, and when a file limit is reached. But raising SIGINT is only meant to happen in the first case. Signed-off-by: Olivier Brunel <jjk@jjacky.com>
2018-08-10libalpm/dload.c: add case for CURLE_COULDNT_RESOLVE_HOSTMichael Straube
Add a case for curl error 'Could not resolve host'. An attempt to fix FS#48285. Signed-off-by: Michael Straube <straubem@gmx.de> Signed-off-by: Allan McRae <allan@archlinux.org>
2018-06-18libalpm/dload.c: fix filename in license headerMichael Straube
The filename in the license header did not match the actual filename as in the other files. Hopefully this is not too nit-picky. Signed-off-by: Michael Straube <straubem@gmx.de> Signed-off-by: Allan McRae <allan@archlinux.org>
2018-05-14Remove all modelines from the projectEli Schwartz
Many of these are pointless (e.g. there is no need to explicitly turn on spellchecking and language dictionaries for the manpages by default). The only useful modelines are the ones enforcing the project coding standards for indentation style (and "maybe" filetype/syntax, but everything except the asciidoc manpages and makepkg.conf is already autodetected), and indent style can be applied more easily with .editorconfig Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> Signed-off-by: Allan McRae <allan@archlinux.org>
2018-03-14Update coyrights for 2018Allan McRae
make update-copyright OLD=2017 NEW=201 Signed-off-by: Allan McRae <allan@archlinux.org>
2018-01-06dload: ensure callback is always initialized onceAndrew Gregory
Frontends rely on an initialization call for setup between downloads. Checking for intialization after checking for a completed download can skip initialization in cases where files are small enough to be downloaded all at once (FS#56408). Relying on previous download size can result in multiple initializations if there are multiple non-transfer events prior to the download starting (fS#56468). Introduce a new cb_initialized variable to the payload struct and use it to ensure that the callback is initialized exactly once prior to any actual events. Fixes FS#56408, FS#56468 Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2017-01-13Introduce a 'disable-download-timeout' optionChristian Hesse
Add command line option ('--disable-download-timeout') and config file option ('DisableDownloadTimeout') to disable defaults for low speed limit and timeout on downloads. Use this if you have issues downloading files with proxy and/or security gateway. Signed-off-by: Christian Hesse <mail@eworm.de> Signed-off-by: Allan McRae <allan@archlinux.org>
2017-01-04alpm_fetch_pkgurl: fix memory leakAllan McRae
Signed-off-by: Allan McRae <allan@archlinux.org>
2017-01-04dload: s/CURLOPT_WRITEHEADER/CURLOPT_HEADERDATA/Dave Reisner
The former is really old, and should be avoided. Signed-off-by: Allan McRae <allan@archlinux.org>
2017-01-04Update copyright yearsAllan McRae
Signed-off-by: Allan McRae <allan@archlinux.org>
2016-12-05Parametrise the different ways in which the payload is resetMartin Kühne
In FS#43434, Downloads which fail and are restarted on a different server will resume and may display a negative download speed. The payload's progress in libalpm was not properly reset which ultimately caused terminal noise because the line width calculation assumes positive download speeds. This patch fixes the incomplete reset of the payload by mimicing what be_sync.c:alpm_db_update() does over in sync.c:download_single_file(). The new dload.c:_alpm_dload_payload_reset_for_retry() extends beyond the current behavior by updating initial_size and prevprogress for this case. This makes pacman reset the progress properly in the next invocation of the callback and display positive download speeds. Fixes FS#43434. Signed-off-by: Martin Kühne <mysatyre@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2016-12-05dload: use curl's keepalive mechanismDave Reisner
This does exactly the same thing as it code it replaces, but punt to curl to do it for brevity. Requires curl 7.25.0, which we already cover. Signed-off-by: Allan McRae <allan@archlinux.org>
2016-10-22Add ALPM_ERR_OK to _alpm_errno_tIvy Foster
This allows functions which return an _alpm_errno_t to always return a genuine _alpm_errno_t for consistency, even in cases where there are no errors. Since ALPM_ERR_OK = 0, their callers can still simply check 'err = some_fn(); if (!err) { ... }'. Signed-off-by: Ivy Foster <ivy.foster@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2016-08-31Replace CURLOPT_PROGRESSFUNCTION with CURLOPT_XFERINFOFUNCTIONIvy Foster
Curl 7.32.0 added CURLOPT_XFERINFOFUNCTION, which deprecates CURLOPT_PROGRESSFUNCTION and means less casting doubles to size_ts for alpm. This change has no user-facing nor frontend-facing effects. Signed-off-by: Ivy Foster <ivy.foster@gmail.com> Signed-off-by: Allan McRae <allan@archlinux.org>
2016-08-30Normalize alpm download callback's frontend cb argumentsIvy Foster
When curl calls alpm's dlcb, alpm calls the frontend's cb with the following (dlsize, totalsize) arguments: 0, -1: initialize 0, 0: no change since last call x {x>0, x<y}, y {y>0}: data downloaded, total size known x {x>0}, x: download finished If total size is not known, do not call frontend cb (no change to original behavior); alpm's callback shouldn't be called if there is a download error. See agregory's original spec here: https://wiki.archlinux.org/index.php/User:Apg#download_callback Signed-off-by: Allan McRae <allan@archlinux.org>
2016-01-04Update copyright years for 2016Allan McRae
make update-copyright OLD=2015 NEW=2016 Signed-off-by: Allan McRae <allan@archlinux.org>
2015-11-11Use correct format specifiersRikard Falkeborn
Signed-off-by: Allan McRae <allan@archlinux.org>
2015-02-01Update copyright notices for 2015Allan McRae
Signed-off-by: Allan McRae <allan@archlinux.org>
2014-12-24create_tempfile: fix memory leak on errorAllan McRae
Signed-off-by: Allan McRae <allan@archlinux.org>
2014-10-19dload: mark final_url as constChristian Hesse
Signed-off-by: Allan McRae <allan@archlinux.org>
2014-10-16dload: unlink file on filesize exceeded errorChristian Hesse
On filesize exceeded error pacman leaves a .part file in cache dir, resulting in this error on next try: error: failed to commit transaction (wrong or NULL argument passed) Errors occurred, no packages were upgraded. Unlink the file on error to avoid this.
2014-10-16dload: use better error message on exceeded file sizeChristian Hesse
Signed-off-by: Allan McRae <allan@archlinux.org>
2014-10-02alpm: Fix wrong xferred/total sizes when resuming downloadsOlivier Brunel
When a package is already partially downloaded in the cache, its download size will only be of what's left to be downloaded. Since pkg->download_size is what's used when calculating the total download size for the totaldl callback, same thing apply. However, the download progress callback was including this initial size, which would thus lead to invalid values (and percentage) used in frontends. That is, the progress bar could e.g. go further than 100% In the case of pacman, there is a sanity check for different historical reason (44a57c89), so before the possible "overflow" was noticed, the total download size/progress reported was wrong. Once caught, the TotalDownload option was ignored and it would use individual file download values as fallback instead. Signed-off-by: Olivier Brunel <jjk@jjacky.com> Signed-off-by: Allan McRae <allan@archlinux.org>