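Symptoms and background
While learning web scraping, I send a POST request to https://api.opensea.io/graphql/
With the same headers and the same data, the request succeeds through http.client.HTTPSConnection(), but requests.post() gets blocked by the site's anti-scraping protection (referred to below as "the error").
Relevant code
Someone else's working code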
import http.client
conn = http.client.HTTPSConnection("api.opensea.io")
payload = "{\n \"id\": \"AssetSearchQuery\",\n \"query\": \"query AssetSearchQuery(\\n $categories: [CollectionSlug!]\\n $chains: [ChainScalar!]\\n $collection: CollectionSlug\\n $collectionQuery: String\\n $collectionSortBy: CollectionSort\\n $collections: [CollectionSlug!]\\n $count: Int\\n $cursor: String\\n $identity: IdentityInputType\\n $includeHiddenCollections: Boolean\\n $numericTraits: [TraitRangeType!]\\n $paymentAssets: [PaymentAssetSymbol!]\\n $priceFilter: PriceFilterType\\n $query: String\\n $resultModel: SearchResultModel\\n $showContextMenu: Boolean = false\\n $shouldShowQuantity: Boolean = false\\n $sortAscending: Boolean\\n $sortBy: SearchSortBy\\n $stringTraits: [TraitInputType!]\\n $toggles: [SearchToggle!]\\n $creator: IdentityInputType\\n $assetOwner: IdentityInputType\\n $isPrivate: Boolean\\n $safelistRequestStatuses: [SafelistRequestStatus!]\\n) {\\n query {\\n ...AssetSearch_data_2hBjZ1\\n }\\n}\\n\\nfragment AssetCardAnnotations_assetBundle on AssetBundleType {\\n assetCount\\n}\\n\\nfragment AssetCardAnnotations_asset_3Aax2O on AssetType {\\n assetContract {\\n chain\\n id\\n }\\n decimals\\n ownedQuantity(identity: $identity) @include(if: $shouldShowQuantity)\\n relayId\\n favoritesCount\\n isDelisted\\n isFavorite\\n isFrozen\\n hasUnlockableContent\\n ...AssetCardBuyNow_data\\n orderData {\\n bestAsk {\\n orderType\\n relayId\\n maker {\\n address\\n }\\n }\\n }\\n ...AssetContextMenu_data_3z4lq0 @include(if: $showContextMenu)\\n}\\n\\nfragment AssetCardBuyNow_data on AssetType {\\n tokenId\\n relayId\\n assetContract {\\n address\\n chain\\n id\\n }\\n collection {\\n slug\\n id\\n }\\n orderData {\\n bestAsk {\\n relayId\\n }\\n }\\n}\\n\\nfragment AssetCardContent_asset on AssetType {\\n relayId\\n name\\n ...AssetMedia_asset\\n assetContract {\\n address\\n chain\\n openseaVersion\\n id\\n }\\n tokenId\\n collection {\\n slug\\n id\\n }\\n isDelisted\\n}\\n\\nfragment AssetCardContent_assetBundle on AssetBundleType {\\n 
assetQuantities(first: 18) {\\n edges {\\n node {\\n asset {\\n relayId\\n ...AssetMedia_asset\\n id\\n }\\n id\\n }\\n }\\n }\\n}\\n\\nfragment AssetCardFooter_assetBundle on AssetBundleType {\\n ...AssetCardAnnotations_assetBundle\\n name\\n assetCount\\n assetQuantities(first: 18) {\\n edges {\\n node {\\n asset {\\n collection {\\n name\\n relayId\\n slug\\n isVerified\\n ...collection_url\\n id\\n }\\n id\\n }\\n id\\n }\\n }\\n }\\n assetEventData {\\n lastSale {\\n unitPriceQuantity {\\n ...AssetQuantity_data\\n id\\n }\\n }\\n }\\n orderData {\\n bestBid {\\n orderType\\n paymentAssetQuantity {\\n ...AssetQuantity_data\\n id\\n }\\n }\\n bestAsk {\\n closedAt\\n orderType\\n dutchAuctionFinalPrice\\n openedAt\\n priceFnEndedAt\\n quantity\\n decimals\\n paymentAssetQuantity {\\n quantity\\n ...AssetQuantity_data\\n id\\n }\\n }\\n }\\n}\\n\\nfragment AssetCardFooter_asset_3Aax2O on AssetType {\\n ...AssetCardAnnotations_asset_3Aax2O\\n name\\n tokenId\\n collection {\\n slug\\n name\\n isVerified\\n ...collection_url\\n id\\n }\\n isDelisted\\n assetContract {\\n address\\n chain\\n openseaVersion\\n id\\n }\\n assetEventData {\\n lastSale {\\n unitPriceQuantity {\\n ...AssetQuantity_data\\n id\\n }\\n }\\n }\\n orderData {\\n bestBid {\\n orderType\\n paymentAssetQuantity {\\n ...AssetQuantity_data\\n id\\n }\\n }\\n bestAsk {\\n closedAt\\n orderType\\n dutchAuctionFinalPrice\\n openedAt\\n priceFnEndedAt\\n quantity\\n decimals\\n paymentAssetQuantity {\\n quantity\\n ...AssetQuantity_data\\n id\\n }\\n }\\n }\\n}\\n\\nfragment AssetContextMenu_data_3z4lq0 on AssetType {\\n ...asset_edit_url\\n ...asset_url\\n ...itemEvents_data\\n relayId\\n isDelisted\\n isEditable {\\n value\\n reason\\n }\\n isListable\\n ownership(identity: {}) {\\n isPrivate\\n quantity\\n }\\n creator {\\n address\\n id\\n }\\n collection {\\n isAuthorizedEditor\\n id\\n }\\n imageUrl\\n ownedQuantity(identity: {})\\n}\\n\\nfragment AssetMedia_asset on AssetType {\\n 
animationUrl\\n backgroundColor\\n collection {\\n displayData {\\n cardDisplayStyle\\n }\\n id\\n }\\n isDelisted\\n imageUrl\\n displayImageUrl\\n}\\n\\nfragment AssetQuantity_data on AssetQuantityType {\\n asset {\\n ...Price_data\\n id\\n }\\n quantity\\n}\\n\\nfragment AssetSearchFilter_data_3KTzFc on Query {\\n ...CollectionFilter_data_2qccfC\\n collection(collection: $collection) {\\n numericTraits {\\n key\\n value {\\n max\\n min\\n }\\n ...NumericTraitFilter_data\\n }\\n stringTraits {\\n key\\n ...StringTraitFilter_data\\n }\\n id\\n }\\n ...PaymentFilter_data_2YoIWt\\n}\\n\\nfragment AssetSearchList_data_3Aax2O on SearchResultType {\\n asset {\\n assetContract {\\n address\\n chain\\n id\\n }\\n collection {\\n isVerified\\n relayId\\n id\\n }\\n relayId\\n tokenId\\n ...AssetSelectionItem_data\\n ...asset_url\\n id\\n }\\n assetBundle {\\n relayId\\n id\\n }\\n ...Asset_data_3Aax2O\\n}\\n\\nfragment AssetSearch_data_2hBjZ1 on Query {\\n ...AssetSearchFilter_data_3KTzFc\\n ...SearchPills_data_2Kg4Sq\\n search(after: $cursor, chains: $chains, categories: $categories, collections: $collections, first: $count, identity: $identity, numericTraits: $numericTraits, paymentAssets: $paymentAssets, priceFilter: $priceFilter, querystring: $query, resultType: $resultModel, sortAscending: $sortAscending, sortBy: $sortBy, stringTraits: $stringTraits, toggles: $toggles, creator: $creator, isPrivate: $isPrivate, safelistRequestStatuses: $safelistRequestStatuses) {\\n edges {\\n node {\\n ...AssetSearchList_data_3Aax2O\\n __typename\\n }\\n cursor\\n }\\n totalCount\\n pageInfo {\\n endCursor\\n hasNextPage\\n }\\n }\\n}\\n\\nfragment AssetSelectionItem_data on AssetType {\\n backgroundColor\\n collection {\\n displayData {\\n cardDisplayStyle\\n }\\n imageUrl\\n id\\n }\\n imageUrl\\n name\\n relayId\\n}\\n\\nfragment Asset_data_3Aax2O on SearchResultType {\\n asset {\\n relayId\\n isDelisted\\n ...AssetCardContent_asset\\n ...AssetCardFooter_asset_3Aax2O\\n 
...AssetMedia_asset\\n ...asset_url\\n ...itemEvents_data\\n orderData {\\n bestAsk {\\n paymentAssetQuantity {\\n quantityInEth\\n id\\n }\\n }\\n }\\n id\\n }\\n assetBundle {\\n relayId\\n ...bundle_url\\n ...AssetCardContent_assetBundle\\n ...AssetCardFooter_assetBundle\\n orderData {\\n bestAsk {\\n paymentAssetQuantity {\\n quantityInEth\\n id\\n }\\n }\\n }\\n id\\n }\\n}\\n\\nfragment CollectionFilter_data_2qccfC on Query {\\n selectedCollections: collections(first: 25, collections: $collections, includeHidden: true) {\\n edges {\\n node {\\n assetCount\\n imageUrl\\n name\\n slug\\n isVerified\\n id\\n }\\n }\\n }\\n collections(assetOwner: $assetOwner, assetCreator: $creator, onlyPrivateAssets: $isPrivate, chains: $chains, first: 100, includeHidden: $includeHiddenCollections, parents: $categories, query: $collectionQuery, sortBy: $collectionSortBy) {\\n edges {\\n node {\\n assetCount\\n imageUrl\\n name\\n slug\\n isVerified\\n id\\n __typename\\n }\\n cursor\\n }\\n pageInfo {\\n endCursor\\n hasNextPage\\n }\\n }\\n}\\n\\nfragment CollectionModalContent_data on CollectionType {\\n description\\n imageUrl\\n name\\n slug\\n}\\n\\nfragment NumericTraitFilter_data on NumericTraitTypePair {\\n key\\n value {\\n max\\n min\\n }\\n}\\n\\nfragment PaymentFilter_data_2YoIWt on Query {\\n paymentAssets(first: 10) {\\n edges {\\n node {\\n symbol\\n relayId\\n id\\n __typename\\n }\\n cursor\\n }\\n pageInfo {\\n endCursor\\n hasNextPage\\n }\\n }\\n PaymentFilter_collection: collection(collection: $collection) {\\n paymentAssets {\\n symbol\\n relayId\\n id\\n }\\n id\\n }\\n}\\n\\nfragment Price_data on AssetType {\\n decimals\\n imageUrl\\n symbol\\n usdSpotPrice\\n assetContract {\\n blockExplorerLink\\n chain\\n id\\n }\\n}\\n\\nfragment SearchPills_data_2Kg4Sq on Query {\\n selectedCollections: collections(first: 25, collections: $collections, includeHidden: true) {\\n edges {\\n node {\\n imageUrl\\n name\\n slug\\n ...CollectionModalContent_data\\n id\\n 
}\\n }\\n }\\n}\\n\\nfragment StringTraitFilter_data on StringTraitType {\\n counts {\\n count\\n value\\n }\\n key\\n}\\n\\nfragment asset_edit_url on AssetType {\\n assetContract {\\n address\\n chain\\n id\\n }\\n tokenId\\n collection {\\n slug\\n id\\n }\\n}\\n\\nfragment asset_url on AssetType {\\n assetContract {\\n address\\n chain\\n id\\n }\\n tokenId\\n}\\n\\nfragment bundle_url on AssetBundleType {\\n slug\\n}\\n\\nfragment collection_url on CollectionType {\\n slug\\n}\\n\\nfragment itemEvents_data on AssetType {\\n assetContract {\\n address\\n chain\\n id\\n }\\n tokenId\\n}\\n\",\n \"variables\": {\n \"categories\": null,\n \"chains\": null,\n \"collection\": \"doodles-official\",\n \"collectionQuery\": null,\n \"collectionSortBy\": null,\n \"collections\": [\n \"doodles-official\"\n ],\n \"count\": 32,\n \"cursor\": \"" + "QXNzZXRUeXBlOjc0NDE1OTQ1" + "\",\n \"identity\": null,\n \"includeHiddenCollections\": null,\n \"numericTraits\": null,\n \"paymentAssets\": null,\n \"priceFilter\": null,\n \"query\": null,\n \"resultModel\": \"ASSETS\",\n \"showContextMenu\": true,\n \"shouldShowQuantity\": false,\n \"sortAscending\": true,\n \"sortBy\": \"PRICE\",\n \"stringTraits\": null,\n \"toggles\": null,\n \"creator\": null,\n \"assetOwner\": null,\n \"isPrivate\": null,\n \"safelistRequestStatuses\": null\n }\n}"
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36 Edg/96.0.1054.62",
"Referer": "https://opensea.io/",
"Accept": "*/*",
"Connection": "Keep-Alive",
'x-api-key': '2f6f419a083c46de9d83ce3dbe7db601',
'x-build-id': '2prqoQtl5Cs0-W3jgjHRB',
'x-signed-query': 'ac90cef79e9898a9e0d2dea265a5355bd39307f6ecd89239bdc6a5a11c408bc8',
'content-type': 'application/json'
}
# conn.request() returns None, so there is nothing to assign (and `str` would shadow the builtin)
conn.request("POST", "/graphql/?", payload, headers)
res = conn.getresponse()
a = res.read()
print(a)
My own code, which fails
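import requests
url = 'https://api.opensea.io/graphql/'
# payload and headers are the same as in the code above; omitted here because of the post length limit
payload = {}
headers = {}
# The working example sends a POST, so use requests.post rather than requests.get
a = requests.post(url, data=payload, headers=headers)
# Print the response body; print(a) alone only shows the Response object, not its content
print(a.content)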
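One way to check whether the two clients really send "the same" request is to prepare the requests call without sending it and look at the result. The sketch below is a diagnostic under stated assumptions, not a fix: the payload and headers are shortened placeholders, and `describe_request` is a hypothetical helper name. It relies on the fact that a requests `Session` merges its own default headers (for example `Accept-Encoding`) into whatever `headers` dict is passed, so the final header set can differ from what `http.client` sends. Whether that difference is what triggers the block is not something this post confirms.

```python
import json
import requests

URL = "https://api.opensea.io/graphql/"

# Placeholder body and headers for illustration only; substitute the real
# payload string and headers dict from the working http.client example.
payload = json.dumps({"id": "AssetSearchQuery", "variables": {"count": 32}})
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "content-type": "application/json",
}

def describe_request(method, url, body, hdrs):
    """Prepare (but do not send) a request; return its method, headers and body."""
    session = requests.Session()
    req = requests.Request(method, url, data=body, headers=hdrs)
    # prepare_request merges the session's default headers into hdrs
    prepared = session.prepare_request(req)
    return prepared.method, dict(prepared.headers), prepared.body

method, sent_headers, body = describe_request("POST", URL, payload, headers)
print(method)
for name, value in sent_headers.items():
    print(f"{name}: {value}")
```

Comparing this output with the headers in the http.client example shows which fields requests added on its own. Nothing is sent over the network here.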
Error output
b'<!DOCTYPE html>\n<!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en-US"> <![endif]-->\n<!--[if IE 7]> <html class="no-js ie7 oldie" lang="en-US"> <![endif]-->\n<!--[if IE 8]> <html class="no-js ie8 oldie" lang="en-US"> <![endif]-->\n<!--[if gt IE 8]><!--> <html class="no-js" lang="en-US"> <!--<![endif]-->\n<head>\n<title>Access denied | api.opensea.io used Cloudflare to restrict access</title>\n<meta charset="UTF-8" />\n<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />\n<meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />\n<meta name="robots" content="noindex, nofollow" />\n<meta name="viewport" content="width=device-width,initial-scale=1" />\n<link rel="stylesheet" id="cf_styles-css" href="/cdn-cgi/styles/main.css" type="text/css" media="screen,projection" />\n\n\n<script defer src="https://api.radar.cloudflare.com/beacon.js"></script>\n</head>\n<body>\n <div id="cf-wrapper">\n <div class="cf-alert cf-alert-error cf-cookie-error hidden" id="cookie-alert" data-translate="enable_cookies">Please enable cookies.</div>\n <div id="cf-error-details" class="p-0">\n <header class="mx-auto pt-10 lg:pt-6 lg:px-8 w-240 lg:w-full mb-15 antialiased">\n <h1 class="inline-block md:block mr-2 md:mb-2 font-light text-60 md:text-3xl text-black-dark leading-tight">\n <span data-translate="error">Error</span>\n <span>1020</span>\n </h1>\n <span class="inline-block md:block heading-ray-id font-mono text-15 lg:text-sm lg:leading-relaxed">Ray ID: 6c371e3a8d2a1c7f •</span>\n <span class="inline-block md:block heading-ray-id font-mono text-15 lg:text-sm lg:leading-relaxed">2021-12-26 03:10:47 UTC</span>\n <h2 class="text-gray-600 leading-1.3 text-3xl lg:text-2xl font-light">Access denied</h2>\n </header>\n\n <section class="w-240 lg:w-full mx-auto mb-8 lg:px-8">\n <div id="what-happened-section" class="w-1/2 md:w-full">\n <h2 class="text-3xl leading-tight font-normal mb-4 text-black-dark antialiased" data-translate="what_happened">What 
happened?</h2>\n <p>This website is using a security service to protect itself from online attacks.</p>\n \n </div>\n\n \n </section>\n\n <div class="cf-error-footer cf-wrapper w-240 lg:w-full py-10 sm:py-4 sm:px-8 mx-auto text-center sm:text-left border-solid border-0 border-t border-gray-300">\n <p class="text-13">\n <span class="cf-footer-item sm:block sm:mb-1">Cloudflare Ray ID: <strong class="font-semibold">6c371e3a8d2a1c7f</strong></span>\n <span class="cf-footer-separator sm:hidden">•</span>\n <span class="cf-footer-item sm:block sm:mb-1"><span>Your IP</span>: 223.95.6.72</span>\n <span class="cf-footer-separator sm:hidden">•</span>\n <span class="cf-footer-item sm:block sm:mb-1"><span>Performance & security by</span> <a rel="noopener noreferrer" href="https://www.cloudflare.com/5xx-error-landing" id="brand_link" target="_blank">Cloudflare</a></span>\n \n </p>\n</div><!-- /.error-footer -->\n\n\n </div><!-- /#cf-error-details -->\n </div><!-- /#cf-wrapper -->\n\n <script type="text/javascript">\n window._cf_translation = {};\n \n \n</script>\n\n<script defer src="https://static.cloudflareinsights.com/beacon.min.js/v652eace1692a40cfa3763df669d7439c1639079717194" integrity="sha512-Gi7xpJR8tSkrpF7aordPZQlW2DLtzUlZcumS8dMQjwDHEnw9I7ZLyiOj/6tZStRBGtGgN6ceN6cMH8z7etPGlw==" data-cf-beacon=\'{"rayId":"6c371e3a8d2a1c7f","token":"96047490de144c8b91be74ba7605ab69","version":"2021.12.0","si":100}\' crossorigin="anonymous"></script>\n</body>\n</html>\n'
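The body above is a Cloudflare "Access denied" page (Error 1020), not a GraphQL response. When testing different request variants, a small check like the following can tell the two apart without reading the HTML by eye. This is a rough heuristic sketch: `looks_like_cloudflare_block` is a hypothetical helper, and the marker strings are simply taken from the page title shown above.

```python
def looks_like_cloudflare_block(body):
    """Heuristic: does this response body look like Cloudflare's 'Access denied' page?"""
    # Accept either bytes (as returned by res.read()) or str
    text = body.decode("utf-8", errors="replace") if isinstance(body, bytes) else body
    return "Access denied" in text and "Cloudflare" in text

# Sample taken from the <title> of the error page above
sample = b"<title>Access denied | api.opensea.io used Cloudflare to restrict access</title>"
blocked = looks_like_cloudflare_block(sample)
not_blocked = looks_like_cloudflare_block(b'{"data": {}}')
print(blocked, not_blocked)
```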
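What I want to achieve
I hope someone can explain why a request sent directly with requests is detected as a crawler, and how to modify the requests call correctly.
Many thanks for your help!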