Total words: 1112 | 2-word phrases: 286 | 3-word phrases: 298 | 4-word phrases: 303
Title | Common Crawl - Open Repository of Web Crawl Data | 48 characters (try to keep the title under 60 characters) |
Description | | 0 characters (try to keep the meta description between 50 and 160 characters) |
Keywords | | 0 characters (meta keywords are no longer recommended) |
H1 | Common Crawl maintains a free, open repository of web crawl data that can be used by anyone. | 92 characters (H1 tag on the page) |
# | Keyword | H1 | Title | Des | Volume | Position | Suggest | Frequency | Density |
---|---|---|---|---|---|---|---|---|---|
1 | a | | | | | | | 12 | 5.33% |
2 | of | | | | | | | 12 | 5.33% |
3 | the | | | | | | | 7 | 3.11% |
4 | crawl | | | | | | | 7 | 3.11% |
5 | is | | | | | | | 7 | 3.11% |
6 | inside | | | | | | | 6 | 2.67% |
7 | web | | | | | | | 6 | 2.67% |
8 | block | | | | | | | 6 | 2.67% |
9 | this | | | | | | | 6 | 2.67% |
10 | div | | | | | | | 6 | 2.67% |
# | URL | Domain |
---|---|---|
1 | https://commoncrawl.github.io/cc-crawl-statistics/ | github.io |
2 | https://groups.google.com/g/common-crawl | google.com |
3 | https://huggingface.co/commoncrawl | huggingface.co |
4 | https://discord.gg/njaVFh7avF | discord.gg |
5 | https://arxiv.org/pdf/2404.10006 | arxiv.org |
6 | https://dl.acm.org/doi/10.1145/3589334.3645510 | acm.org |
7 | https://arxiv.org/abs/2402.03300 | arxiv.org |
8 | https://arxiv.org/abs/2206.15147 | arxiv.org |
9 | https://munin.uit.no/handle/10037/28861 | uit.no |
10 | https://doi.org/10.1016/j.jksuci.2023.01.004 | doi.org |
11 | https://scholar.google.com/scholar?q=common+crawl | google.com |
12 | https://github.com/commoncrawl/cc-citations/ | github.com |
13 | https://commoncrawl.github.io/cc-crawl-statistics/ | github.io |
14 | https://groups.google.com/g/common-crawl | google.com |
15 | https://huggingface.co/commoncrawl | huggingface.co |
16 | https://discord.gg/njaVFh7avF | discord.gg |
17 | https://x.com/commoncrawl | x.com |
18 | https://www.linkedin.com/company/common-crawl/ | linkedin.com |
19 | https://discord.gg/njaVFh7avF | discord.gg |