Archived document. This file has been superseded or completed and is kept for historical reference.
Nginx 404 Investigation Report
Date: 2026-01-25
Investigator: Claude Opus 4.5
Site: sempers.com
Hosting Provider: SiteGround
1. Current .htaccess Contents
File: /Users/zjs/engineering/sempers.com/public_html/.htaccess
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www.sempers.com [NC]
RewriteRule ^(.*)$ https://sempers.com/$1 [L,R=301]
ErrorDocument 404 /404error.html
Analysis:
- The .htaccess configuration is correct for Apache
- Lines 1-3: Redirect www to non-www (standard SEO practice)
- Line 5: Uses the proper ErrorDocument 404 directive, which should return an HTTP 404 status with the custom error page content
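To make the redirect rule concrete, the following is a minimal Python sketch that mimics the RewriteCond/RewriteRule pair with the `re` module. It is an illustration only, not how Apache evaluates rules internally, and the function name `www_redirect_target` is hypothetical. It also shows a side effect of the unescaped dots in the host pattern: they match any character, not just a literal dot.

```python
import re
from typing import Optional

# Same pattern as the RewriteCond; the [NC] flag corresponds to re.IGNORECASE.
HOST_PATTERN = re.compile(r"^www.sempers.com", re.IGNORECASE)
# Same pattern as the RewriteRule. In .htaccess (per-directory) context,
# Apache matches the rule against the path without its leading slash.
PATH_PATTERN = re.compile(r"^(.*)$")


def www_redirect_target(host: str, path: str) -> Optional[str]:
    """Return the 301 target for a request, or None if the rule does not fire."""
    if not HOST_PATTERN.search(host):
        return None
    relative = path[1:] if path.startswith("/") else path
    match = PATH_PATTERN.match(relative)
    if match is None:
        return None
    # "$1" in the RewriteRule substitution is the first capture group.
    return "https://sempers.com/" + match.group(1)
```

Under this sketch, a request for `/about.html` on `www.sempers.com` maps to `https://sempers.com/about.html`, while the same request on `sempers.com` is left alone.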
2. Root Cause: 301 Instead of 404
The Problem
When requesting a non-existent URL on sempers.com, the server returns:
HTTP/1.1 301 Moved Permanently
Server: nginx
Location: https://sempers.com/404error.html
Following the redirect then returns 200 OK. Google interprets this as a “soft 404” because:
- The initial response is a 301 redirect, not a 404 status
- The final page (404error.html) returns 200 OK
- Google sees this as a page that exists (200) but contains error content
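The reasoning above can be sketched as a small classifier over the sequence of status codes seen when requesting a URL that should not exist. This is a rough model of the argument in this report, not Google's actual soft-404 detection (which also looks at page content); the function name and the labels are hypothetical.

```python
from typing import List


def classify_missing_url(statuses: List[int]) -> str:
    """Classify the status codes returned for a URL that should not exist.

    `statuses` lists the codes in the order received, e.g. [301, 200]
    for a redirect that ends on a page served with 200 OK.
    """
    if not statuses:
        raise ValueError("at least one status code is required")
    # A 404 (or 410) on the first response is an unambiguous "not found".
    if statuses[0] in (404, 410):
        return "hard 404"
    # Ending on 200 means the crawler sees an existing page, even though
    # its content is an error message.
    if statuses[-1] == 200:
        return "soft-404 candidate"
    return "other"
```

The chain observed on sempers.com corresponds to `classify_missing_url([301, 200])`, while the desired behavior corresponds to `classify_missing_url([404])`.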
Root Cause Identified
The issue is NOT in the .htaccess file. The issue is at the nginx reverse proxy layer.
SiteGround uses an nginx reverse proxy in front of Apache. The request flow is:
Client Request
|
v
[ nginx ] <-- Handles 404s BEFORE reaching Apache
|
v
[ Apache ] <-- .htaccess never processes 404s for missing files
|
v
Response
Evidence from curl response:
- Server: nginx header: the response comes from nginx, not Apache
- nginx is configured to redirect (301) to 404error.html instead of returning a true 404
The .htaccess ErrorDocument 404 directive works correctly in Apache, but nginx intercepts the request for missing files before Apache processes it.
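The layering described here can be illustrated with a toy model. It encodes this report's hypothesis about the request flow, not SiteGround's real configuration; the function names, the `intercept_missing` flag, and the assumption that the proxy always answers with its own Server header are all illustrative.

```python
from typing import Dict, Set, Tuple

Response = Tuple[int, Dict[str, str]]  # (status code, response headers)

ERROR_PAGE = "/404error.html"


def apache(path: str, files: Set[str]) -> Response:
    """Apache with 'ErrorDocument 404 /404error.html': missing files get 404."""
    if path in files:
        return 200, {}
    # The error page body is served, but the status code stays 404.
    return 404, {}


def nginx(path: str, files: Set[str], intercept_missing: bool = True) -> Response:
    """nginx in front of Apache.

    With intercept_missing=True this models the observed behavior: a 301
    to the error page for missing files, so Apache never sees the request.
    """
    if intercept_missing and path not in files:
        location = "https://sempers.com" + ERROR_PAGE
        return 301, {"Server": "nginx", "Location": location}
    status, headers = apache(path, files)
    headers["Server"] = "nginx"  # assumed: the proxy sets its own Server header
    return status, headers
```

With `intercept_missing=False` the model passes the request through, and Apache's 404 reaches the client, which is the outcome the fix in the next section aims for.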
3. Fix Required
Option A: SiteGround Support Required (Recommended)
This is NOT fixable via code changes. The nginx configuration is managed by SiteGround and cannot be modified via .htaccess or any file in the web root.
Contact SiteGround Support with this request:
Subject: Configure nginx to return proper 404 status codes
We need nginx to return HTTP 404 status codes for missing files, not 301 redirects.
Current behavior:
- Requesting a non-existent URL returns 301 Moved Permanently with redirect to /404error.html
- This causes “soft 404” errors in Google Search Console
Desired behavior:
- Nginx should return HTTP 404 status code for missing files
- The 404error.html content can be served, but with 404 status code
- Alternatively, let Apache handle 404s via .htaccess ErrorDocument directive
This is impacting our SEO as Google indexes these as soft 404s.
Option B: Workaround (Not Recommended)
Some SiteGround plans allow creating custom nginx configuration via Site Tools. However, this is not available on all plans and may not fully resolve the issue.
Option C: CDN/Edge Rule (Alternative)
If using Cloudflare or another CDN in front of SiteGround:
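- Configure an edge rule to return 404 for specific URL patterns (on Cloudflare this likely requires a Worker, since Page Rules and Transform Rules do not change the response status code)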
- This adds complexity and may have caching implications
4. Summary: Code vs SiteGround Config
| Aspect | Fixable via Code | Requires SiteGround |
|---|---|---|
| .htaccess ErrorDocument | Already correct | N/A |
| nginx 301 redirect behavior | NO | YES |
| robots.txt Disallow rules | YES (completed) | N/A |
| noindex meta tags | YES (already present) | N/A |
Verdict: The 301 vs 404 issue requires SiteGround support to fix the nginx configuration. Code changes alone cannot resolve this.
5. robots.txt Deployment Status
Status: DEPLOYED (pending cache propagation)
Commit: 14d0cc3
Pushed to: origin/master
Push Time: 2026-01-25
Production Deployment:
- Workflow: deploy-production.yml
- Run ID: 21346193048
- Status: COMPLETED (success)
- Duration: 55 seconds
- Completed: 2026-01-26T04:22:42Z
Changes deployed:
Sitemap: https://sempers.com/sitemap.xml
User-agent: *
Crawl-delay: 10
Disallow: /feedback/
Disallow: /assets/includes/
Disallow: /assets/guides/
Cache Status:
- The old robots.txt may still be served due to SiteGround/CDN caching
- New content should propagate within 15-30 minutes
- To verify, check https://sempers.com/robots.txt after cache expiration
- If caching persists, purge cache via SiteGround Site Tools > Speed > Caching
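As a quick sanity check of the deployed rules, the sketch below feeds the robots.txt content listed above to Python's standard `urllib.robotparser` and queries it. The helper name `load_rules` and the sample paths used in the usage note are hypothetical; the parser models the generic robots.txt conventions and individual crawlers may interpret some directives differently.

```python
from urllib.robotparser import RobotFileParser

# Content copied from the "Changes deployed" listing in this report.
DEPLOYED_ROBOTS_TXT = """\
Sitemap: https://sempers.com/sitemap.xml
User-agent: *
Crawl-delay: 10
Disallow: /feedback/
Disallow: /assets/includes/
Disallow: /assets/guides/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a queryable RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(DEPLOYED_ROBOTS_TXT)
```

For example, `rules.can_fetch("*", "https://sempers.com/feedback/form.html")` reports whether a path under /feedback/ is blocked. To check the live file instead of the pasted text, `RobotFileParser` also offers `set_url()` followed by `read()`.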
6. Complete List of Files Examined
| File | Location | Purpose |
|---|---|---|
| .htaccess | /Users/zjs/engineering/sempers.com/public_html/.htaccess | Apache server config |
| robots.txt | /Users/zjs/engineering/sempers.com/public_html/robots.txt | Crawler directives |
| DEPLOYMENT_REPORT.md | /Users/zjs/engineering/sempers.com/business-files/DEPLOYMENT_REPORT.md | Previous deployment notes |
| STATUS_OF_INDEXING_REPAIR.md | /Users/zjs/engineering/sempers.com/business-files/STATUS_OF_INDEXING_REPAIR.md | Indexing fix status |
| INDEXING-REPORT.md | /Users/zjs/engineering/sempers.com/INDEXING-REPORT.md | Detailed indexing analysis |
No nginx .conf files were found in the project, which is consistent with nginx being managed server-side by SiteGround.
7. Recommended Next Steps
Immediate
- Contact SiteGround Support - Request nginx configuration change to return proper 404 status codes
- Monitor robots.txt deployment - Verify https://sempers.com/robots.txt shows new Disallow rules
After SiteGround Fixes nginx
- Verify 404 behavior:
  curl -I https://sempers.com/this-page-does-not-exist-test
  # Should return: HTTP/1.1 404 Not Found
- Request re-validation in Google Search Console:
  - Go to Indexing > Pages
  - Click on “Soft 404” issue category
  - Click “Validate Fix”
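The curl check above can also be scripted so it is easy to repeat after SiteGround responds. This is a minimal sketch using Python's standard `http.client`, which does not follow redirects, so the first response is reported as-is. The function name `head_status` and the default timeout are hypothetical choices.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def head_status(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str], Optional[str]]:
    """Send one HEAD request and return (status, Server header, Location header).

    Redirects are not followed, so a 301 shows up as 301 rather than as
    the status of the page it points to.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Server"), response.getheader("Location")
    finally:
        conn.close()
```

Calling `head_status("https://sempers.com/this-page-does-not-exist-test")` should yield a status of 404 once the fix is in place. A status of 301 with a Location ending in /404error.html would indicate the redirect is still active.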
Timeline Expectations
| Item | Expected Timeline |
|---|---|
| robots.txt live on production | 5-10 minutes (after workflow completes) |
| Google re-fetches robots.txt | 24-48 hours |
| SiteGround nginx fix | Depends on support response (1-3 business days) |
| Google re-crawls soft 404 URLs | 1-4 weeks after nginx fix |
| Full resolution in Search Console | 2-4 weeks after all fixes |
Report generated: 2026-01-25 Investigator: Claude Opus 4.5