url_interface

ioapps.url_interface.get_contents(url, fail_nonread=False, fail_schema=False, timeout=10)

DESCRIPTION:

This function attempts to collect the contents of the URL path url specified upon entry.

Parameters:
url: str

A Python string specifying the URL path contents to be collected.

Keywords:
fail_nonread: bool, optional

A Python boolean valued variable specifying whether to fail when a URL path is non-readable and/or does not contain readable contents.

fail_schema: bool, optional

A Python boolean valued variable specifying whether to fail if a MissingSchema exception is raised by the requests package.

timeout: int, optional

A Python integer value specifying how long to wait for the URL request to complete before it times out.

Returns:
data: Union[str, None]

A Python string containing the contents of the URL path url specified upon entry; if the contents cannot be collected and the corresponding keyword (fail_nonread or fail_schema) is False, None is returned.

Raises:
URLInterfaceError:
  • raised if the URL path is non-readable and fail_nonread is True upon entry.

  • raised if the schema for the URL path could not be determined and fail_schema is True upon entry.

Return type:

Optional[str]
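Because get_contents is documented to return either a string or None, callers usually need a fallback for the None case. The following is a minimal sketch of that calling pattern, not part of ioapps: contents_or_default and fake_getter are hypothetical names, and fake_getter stands in for get_contents so the example runs without network access.

```python
from typing import Callable, Optional


def contents_or_default(
    getter: Callable[..., Optional[str]],
    url: str,
    default: str = "",
) -> str:
    """Return the URL contents, or `default` when the getter yields None."""
    # With fail_nonread and fail_schema both False (the documented defaults),
    # an unreadable URL is assumed to be reported as None, not as an exception.
    data = getter(url, fail_nonread=False, fail_schema=False, timeout=10)
    return default if data is None else data


def fake_getter(url, fail_nonread=False, fail_schema=False, timeout=10):
    """Hypothetical stand-in for get_contents; performs no network access."""
    pages = {"https://example.com/readme.txt": "hello"}
    return pages.get(url)


print(contents_or_default(fake_getter, "https://example.com/readme.txt"))  # hello
print(contents_or_default(fake_getter, "https://example.com/missing", default="n/a"))  # n/a
```

In real use, ioapps.url_interface.get_contents would be passed in place of fake_getter; setting fail_nonread or fail_schema to True would instead surface the failure as URLInterfaceError.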

ioapps.url_interface.get_weblist(url, ext=None, include_dirname=False)

DESCRIPTION:

This function builds a list of files beneath the specified URL file path.

Parameters:
url: str

A Python string specifying the path to the internet (world-wide web; WWW) file to be retrieved.

Keywords:
ext: str, optional

A Python string specifying the web filename extension; if None upon entry, the value defaults to an empty string.

include_dirname: bool, optional

A Python boolean valued variable specifying whether to include the URL path directory name in the retrieved file names; if False upon entry, each retrieved file name is reduced to its basename.

Returns:
weblist: List

A Python list containing the files beneath the specified URL.

Raises:
URLInterfaceError:
  • raised if an exception is encountered while attempting to parse the URL path contents; the respective error message accompanies the message string passed to the URLInterfaceError class.

Return type:

List
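The interaction of ext and include_dirname can be illustrated with a small standard-library sketch. This is an assumption about the behavior described above, not the ioapps implementation: weblist_from_html and _LinkCollector are hypothetical names, and the HTML listing is supplied as a string so that no request is made.

```python
from html.parser import HTMLParser
from posixpath import basename
from typing import List, Optional


class _LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag in an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def weblist_from_html(
    html: str,
    url: str,
    ext: Optional[str] = None,
    include_dirname: bool = False,
) -> List[str]:
    """List linked files in `html`, filtered by extension `ext`."""
    ext = "" if ext is None else ext  # None matches every file name
    parser = _LinkCollector()
    parser.feed(html)
    # Keep matching links and reduce each to its basename; directory
    # links such as "../" have an empty basename and are dropped.
    names = [basename(h) for h in parser.hrefs if h.endswith(ext)]
    names = [n for n in names if n]
    if include_dirname:
        return [url.rstrip("/") + "/" + n for n in names]
    return names


listing = '<a href="../">Parent</a><a href="a.nc">a.nc</a><a href="b.txt">b.txt</a>'
print(weblist_from_html(listing, "https://example.com/data/", ext=".nc"))
# ['a.nc']
print(weblist_from_html(listing, "https://example.com/data/", ext=".nc", include_dirname=True))
# ['https://example.com/data/a.nc']
```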

ioapps.url_interface.read_webfile(url, ignore_missing=False, split=None, return_string=False)

DESCRIPTION:

This function collects the contents of a specified URL path and returns a Python list containing the respective contents.

Parameters:
url: str

A Python string specifying the path to the internet (world-wide web; WWW) file to be retrieved.

Keywords:
ignore_missing: bool, optional

A Python boolean valued variable specifying whether to ignore URL path requests that raise urllib.error.HTTPError; if True upon entry and such an error occurs, the returned list (see below) will be empty.

split: str, optional

A Python string specifying the string/characters to be used to split the contents of the respective file.

return_string: bool, optional

A Python boolean valued variable specifying whether to return the contents of the URL path as a string; if False upon entry, the default format of the file (typically bytes) will be returned.

Returns:
contents: List

A Python list containing the contents of the specified URL path.

Raises:
URLInterfaceError:
  • raised if an exception is encountered while establishing the URL path request.

  • raised if opening the specified URL path fails due to a missing endpoint; raised only if ignore_missing is False upon entry.

  • raised if an exception is encountered while parsing the contents of the URL file path specified upon entry.

Return type:

List
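One way the ignore_missing, split, and return_string keywords could combine is sketched below. This is a hedged illustration under stated assumptions, not the ioapps source: read_with_opener and the two opener functions are hypothetical, the opener is injected so that no network access occurs, and the sketch re-raises urllib.error.HTTPError where the documented function raises URLInterfaceError.

```python
import urllib.error
from typing import Callable, List, Optional, Union


def read_with_opener(
    opener: Callable[[str], bytes],
    url: str,
    ignore_missing: bool = False,
    split: Optional[str] = None,
    return_string: bool = False,
) -> List[Union[str, bytes]]:
    """Read `url` through `opener` and return its contents as a list."""
    try:
        raw = opener(url)
    except urllib.error.HTTPError:
        if ignore_missing:
            return []  # a missing endpoint yields an empty list
        raise
    # Decode only on request; otherwise keep the raw bytes.
    data = raw.decode("utf-8") if return_string else raw
    if split is None:
        return [data]
    # The separator must match the type of the data being split.
    sep = split if return_string else split.encode("utf-8")
    return data.split(sep)


def ok_opener(url: str) -> bytes:
    """Hypothetical opener returning fixed bytes."""
    return b"a\nb\nc"


def missing_opener(url: str) -> bytes:
    """Hypothetical opener that simulates an HTTP 404 response."""
    raise urllib.error.HTTPError(url, 404, "Not Found", None, None)


print(read_with_opener(ok_opener, "https://example.com/f.txt", split="\n", return_string=True))
# ['a', 'b', 'c']
print(read_with_opener(missing_opener, "https://example.com/none", ignore_missing=True))
# []
```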