Name: grunt-fetch-pages
Owner: AOE
Description: Grunt plugin for fetching URLs and saving the result as local files
Created: 2014-12-18 09:17:08.0
Updated: 2017-06-12 10:51:26.0
Pushed: 2015-01-05 14:46:25.0
Size: 168
Language: JavaScript
Grunt plugin for fetching URLs and saving the result as local files.
This plugin requires Grunt ~0.4.2.
If you haven't used Grunt before, be sure to check out the Getting Started guide, as it explains how to create a Gruntfile as well as install and use Grunt plugins. Once you're familiar with that process, you may install this plugin with this command:
npm install grunt-fetch-pages --save-dev
Once the plugin has been installed, it may be enabled inside your Gruntfile with this line of JavaScript:
grunt.loadNpmTasks('grunt-fetch-pages');
In your project's Gruntfile, add a section named fetchpages to the data object passed into grunt.initConfig().
grunt.initConfig({
  fetchpages: {
    options: {
      // Task-specific options go here.
    },
    your_target: {
      // Target-specific file lists and/or options go here.
    }
  }
});
options.baseURL
Type: String
Base URL for fetching remote pages via the Grunt “files” feature. Can be omitted when only the urls option is used.
options.destinationFolder
Type: String
Required: yes
Local destination folder for the fetched remote URLs. This option is mandatory.
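As a rough illustration of how destinationFolder relates to the localFile property of a urls entry, the sketch below joins the two into a target path. The helper name resolveLocalPath is hypothetical and is not part of the plugin; the plugin's own path handling may differ.

```javascript
var path = require('path');

// Hypothetical helper for illustration only, not part of grunt-fetch-pages.
// Joins the destination folder and the local file name with forward slashes.
function resolveLocalPath(destinationFolder, localFile) {
  return path.posix.join(destinationFolder, localFile);
}

// Values taken from the usage examples in this README.
var target = resolveLocalPath('test/www-fetched', 'url.html');
console.log(target);
```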
options.urls
Type: Array
Default: []
An optional list of remote URLs to fetch. Each element requires two properties:
url: full remote URL to fetch
localFile: local file name for the fetched page (the destination folder is set by the destinationFolder option)
options.followLinks
Type: Boolean
Default: true
Also fetch sub pages referenced via links (<a href="">). Links within sub pages are not followed at this time.
options.ignoreSelector
Type: String
Default: [rel="nofollow"]
Selector for links to skip when following links (see the followLinks option). The default value matches links with the “rel” attribute set to “nofollow”: <a href="" rel="nofollow">.
The selector is applied as $('a:not(ignoreSelector)'), e.g. $('a:not([rel="nofollow"])').
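The way the option value ends up inside the link query can be sketched as plain string building. The function buildLinkSelector and the .no-fetch class are hypothetical names used only for this illustration; the sketch mirrors the a:not(...) form described above and is not taken from the plugin's source.

```javascript
// Hypothetical helper, not part of grunt-fetch-pages: wraps an
// ignoreSelector value in the 'a:not(...)' form described above.
function buildLinkSelector(ignoreSelector) {
  return 'a:not(' + ignoreSelector + ')';
}

// The documented default value of the ignoreSelector option.
var defaultSelector = buildLinkSelector('[rel="nofollow"]');

// A custom value that would skip links carrying a made-up "no-fetch" class.
var customSelector = buildLinkSelector('.no-fetch');

console.log(defaultSelector);
console.log(customSelector);
```

Any value passed as ignoreSelector therefore has to be valid inside a :not() expression for the selector engine in use.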
options.cleanHTML
Type: Boolean
Default: false
Clean fetched pages with the htmlclean node module, which removes unneeded whitespace, line breaks, comments, etc.
options.fetchBaseURL
Type: Boolean
Default: true
Fetch the baseURL itself. When this option is set to false, the baseURL is not fetched.
Simple example, fetch base URL and follow links:
grunt.initConfig({
  fetchpages: {
    dist: {
      options: {
        baseURL: 'http://localhost:3000',
        destinationFolder: 'test/www-fetched'
      }
    }
  }
});
Full example with all available options set:
grunt.initConfig({
  fetchpages: {
    dist: {
      options: {
        baseURL: 'http://localhost:3000',
        destinationFolder: 'test/www-fetched',
        urls: [
          {url: 'http://localhost:3003/url.html', localFile: 'url.html'}
        ],
        followLinks: true,
        ignoreSelector: '[rel="nofollow"]',
        cleanHTML: false,
        fetchBaseURL: true
      },
      files: [
        {
          src: ['**/*.html', '!url.html'],
          expand: true,
          cwd: 'test/www-root/'
        }
      ]
    }
  }
});
Take care to maintain the existing coding style. Add unit tests for any new or changed functionality. Do not submit code that fails the default grunt task for linting and testing.
Credits
Thanks to SinnerSchrader for support.
License
The MIT License (MIT)
Copyright (c) 2013-2014 Oliver Hellebusch
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.