yann64 f75cbebb44 Initial commit: GEDCOM export scripts and generated filiations
Includes export_lignees_to_gedcom.py (Drupal book → GEDCOM 5.5.1),
export_users_to_webtrees.py, generated GEDCOM files for 16 family
lineages, and webtrees user import SQL. Excludes basesgen.sql (966 MB)
and webtrees_temp_passwords.csv (sensitive).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 13:44:28 +02:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This repository contains basesgen.sql, a ~966 MB phpMyAdmin dump of the basesgen MySQL database for the CGL (Centres de Généalogie du Languedoc) — a genealogical research organization covering the Languedoc region of southern France. The dump was generated from MySQL 8.0 via phpMyAdmin 5.2.2 and contains marriage records from French civil registry archives.

There is no application source code in this repository; the deliverable is the SQL database itself.

Working with the SQL File

Restore the database:

mysql -u <user> -p basesgen < basesgen.sql

Search schema definitions without importing (file is 966 MB — avoid full loads):

grep -A 30 "CREATE TABLE \`<table_name>\`" basesgen.sql
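
Where a fixed `-A 30` window truncates a wide table or spills into the next statement, the same lookup can be done by streaming the file. The following is a minimal sketch, not part of the repository: `extract_create_table` is a hypothetical helper, and it assumes the dump writes each statement as ``CREATE TABLE `name` (`` at the start of a line and ends it on a line terminated by `;`.

```python
def extract_create_table(dump_path, table):
    """Return the CREATE TABLE statement for `table`, or "" if not found.

    Hypothetical helper. Reads the dump line by line, so the 966 MB file
    is never held in memory.
    """
    start = f"CREATE TABLE `{table}`".encode("latin-1")
    collected = []
    with open(dump_path, "rb") as dump:  # binary mode avoids decoding errors
        for line in dump:
            if not collected and not line.startswith(start):
                continue  # still looking for the start of the statement
            collected.append(line)
            # Assumption: the statement ends on the first line ending in ';'
            if line.rstrip().endswith(b";"):
                break
    return b"".join(collected).decode("utf-8", errors="replace")
```

For example, `print(extract_create_table("basesgen.sql", "mariage"))` would print the full definition regardless of how many columns it has.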

Count INSERT statements for a table (a lower bound on the row count, since phpMyAdmin dumps typically pack many rows into each INSERT):

grep -c "^INSERT INTO \`mariage\`" basesgen.sql
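
To get the same figure for every table in one pass over the file, a short streaming script may be more convenient than repeated grep runs. This is a sketch under the same assumption as the grep above (each statement starts a line with ``INSERT INTO `table` ``); `count_insert_statements` is a hypothetical name, and the result counts INSERT lines, not rows.

```python
import re
from collections import Counter

# Matches the table name in a line that starts with: INSERT INTO `name`
INSERT_RE = re.compile(rb"^INSERT INTO `([^`]+)`")


def count_insert_statements(dump_path):
    """Return a Counter mapping table name -> number of INSERT lines."""
    counts = Counter()
    with open(dump_path, "rb") as dump:  # stream; never load the whole file
        for line in dump:
            match = INSERT_RE.match(line)
            if match:
                counts[match.group(1).decode("latin-1")] += 1
    return counts
```

`count_insert_statements("basesgen.sql").most_common(10)` would then list the ten tables with the most INSERT lines.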

Database Schema

Core Genealogical Tables

mariage — The central table. Contains individual marriage act records from civil registry (état civil). Key columns:

  • CLE_MARIAGE — primary key (integer, sequential across batch imports)
  • JOUR_MARIAGE, MOIS_MARIAGE, ANNEE_MARIAGE — marriage date (split fields)
  • NOM_EPOUX / NOM_EPOUSE — groom/bride surnames (uppercase, latin1_general_ci)
  • PRENOM_EPOUX / PRENOM_EPOUSE — given names
  • AGE_EPOUX / AGE_EPOUSE — age at marriage
  • NOM_PERE_EPOUX, NOM_MERE_EPOUX etc. — parents of each spouse
  • VEUF_EPOUX / VEUVE_EPOUSE — widower/widow status
  • LIEU_ACTE — location of the act
  • TYPE_ACTE — type of record
  • CODE_INSEE — INSEE code linking to ville
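
Because the marriage date is stored in three separate columns, any export has to reassemble it and cope with incomplete dates. The sketch below is illustrative only: `format_marriage_date` is a hypothetical helper, and it assumes that MOIS_MARIAGE holds a month number and that an unknown component is stored as NULL, an empty string, or 0. None of this is confirmed by the schema description above.

```python
def format_marriage_date(jour, mois, annee):
    """Build 'YYYY-MM-DD', 'YYYY-MM' or 'YYYY' from the split date fields.

    Returns None when the year itself is unknown.
    """
    def as_int(value):
        # Assumption: unknown parts arrive as None, "" or 0
        try:
            return int(value)
        except (TypeError, ValueError):
            return 0

    day, month, year = as_int(jour), as_int(mois), as_int(annee)
    if year <= 0:
        return None
    if not 1 <= month <= 12:
        return f"{year:04d}"  # year only
    if not 1 <= day <= 31:
        return f"{year:04d}-{month:02d}"  # year and month
    return f"{year:04d}-{month:02d}-{day:02d}"
```

The function degrades to the most precise form the data supports instead of raising, which suits records transcribed from registers where the day or month is illegible.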

ville — Reference table of all French towns. Columns: NOM_MAJ (uppercase name), CODE_INSEE (bigint), LATITUDE, LONGITUDE.

utilisateur — Application users (genealogy researchers). Login format: CGL[A-Z][nnn] for members, Gestionnaire for admins. Stores MD5-hashed passwords (mdp), group (1=admin, 2=member), validity date, last IP.
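
The login convention can be checked with a regular expression. This sketch assumes that `[nnn]` means exactly three digits and that the admin login is the literal string `Gestionnaire`; `classify_login` is a hypothetical helper, not code from the application.

```python
import re

# Assumption: member logins are "CGL", one uppercase letter, three digits
MEMBER_LOGIN_RE = re.compile(r"CGL[A-Z]\d{3}")


def classify_login(login):
    """Return 'admin', 'member' or 'unknown' for a utilisateur login."""
    if login == "Gestionnaire":
        return "admin"
    if MEMBER_LOGIN_RE.fullmatch(login):
        return "member"
    return "unknown"
```

Logins classified as `unknown` are worth reviewing by hand before any user migration, since they do not fit either documented pattern.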

departements — French departments from INSEE. Includes CODE, NCC (uppercase name), CHEFLIEU (INSEE chef-lieu code), PRESENT flag (whether the department's records are in the database).

MISEAJOUR — Tracks batch imports: DATE of import, MIN/MAX range of CLE_MARIAGE values added in that batch.
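
Since each MISEAJOUR row records a key range, a given marriage record can be traced back to the batch that imported it. A minimal sketch, assuming the rows have been fetched as `(DATE, MIN, MAX)` tuples and that both bounds are inclusive; `find_batch` is a hypothetical name.

```python
def find_batch(batches, cle_mariage):
    """Return the import DATE of the batch containing cle_mariage, or None.

    `batches` is an iterable of (date, min_key, max_key) tuples taken from
    MISEAJOUR. Assumption: MIN and MAX are inclusive bounds.
    """
    for date, min_key, max_key in batches:
        if min_key <= cle_mariage <= max_key:
            return date
    return None
```

A `None` result for an existing CLE_MARIAGE would indicate a record that was inserted outside the tracked batch imports.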

cgl_34_stats_req — Search query audit log. Columns: DATE_REQ, REMOTE_ADDR, HTTP_USER_AGENT, TYPE_RECHERCHE (search type), NOM_EPOUX/NOM_EPOUSE (search terms), VARIATION_EPOUX/VARIATION_EPOUSE (phonetic variant flags), DATE_MIN/DATE_MAX (year range), DUREE_REQUETE (query duration in seconds), USERID, VILLE.

Supporting / Utility Tables

  • sauvegarde_mariage — backup copy of mariage
  • MARIAGE_CP — copy/staging table for mariage
  • testmariage — test/scratch table for mariage
  • save_ville / ville_orig — backup copies of ville
  • stat_ville — per-town statistics
  • copie_MISEAJOUR — backup of MISEAJOUR
  • menu_user — application menu entries per user role
  • mois — month name lookup table (French)
  • mariage-bad — rejected/bad records (because of the hyphen, the name must be backtick-quoted in SQL)

Drupal CMS Tables (100+ tables, prefix drupal_)

The application front-end was Drupal 6. These tables manage the website (nodes, users, blocks, cache, roles, menus, etc.) and are largely independent of the genealogical data. Notable:

  • drupal_users — Drupal user accounts (separate from utilisateur)
  • drupal_content_type_ville — Drupal content nodes for towns
  • drupal_content_type_soldat — Drupal content nodes for soldier records

Character Encoding

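Most genealogical tables use latin1 / latin1_general_ci. The departements table uses utf8mb3. Name comparisons are already case-insensitive under this collation, but accented and unaccented letters compare as different, so accent-insensitive matching may require a COLLATE override or normalizing the names before comparison.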
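
When names are compared outside MySQL, for example in an export script, accents can be folded away before matching. The sketch below uses only the Python standard library; `fold_name` is a hypothetical helper, and uppercasing the result is an assumption made to match the uppercase surname columns.

```python
import unicodedata


def fold_name(name):
    """Return `name` uppercased with accents and cedillas removed."""
    # NFD splits each accented letter into a base letter plus combining marks
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return stripped.upper()
```

Values read as raw bytes from a latin1 table should be decoded first, as in `fold_name(raw.decode("latin-1"))`. Ligatures such as `œ` have no decomposition and pass through unchanged apart from the case change.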

Search Logic (from cgl_34_stats_req logs)

The application supported two search modes:

  • Recherche Standard — surname prefix search using ^NAME pattern (regex anchored at start)
  • Recherche Evoluée — advanced search with phonetic variant expansion (VARIATION_EPOUX/VARIATION_EPOUSE flags)

Year range filtering uses DATE_MIN/DATE_MAX against ANNEE_MARIAGE.
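
The standard mode can be approximated in a few lines. This is a reconstruction from the log columns, not the application's actual code: `standard_search` is a hypothetical function, records are assumed to be dictionaries keyed by column name with an integer ANNEE_MARIAGE, and the search term is escaped so that it is treated as a literal prefix.

```python
import re


def standard_search(records, nom_epoux=None, nom_epouse=None,
                    date_min=None, date_max=None):
    """Filter records by surname prefix and inclusive year range."""
    def prefix(term):
        # "^NAME": anchored at the start, case-insensitive like the collation
        return re.compile("^" + re.escape(term), re.IGNORECASE) if term else None

    epoux_re, epouse_re = prefix(nom_epoux), prefix(nom_epouse)
    hits = []
    for rec in records:
        year = rec["ANNEE_MARIAGE"]
        if date_min is not None and year < date_min:
            continue
        if date_max is not None and year > date_max:
            continue
        if epoux_re and not epoux_re.search(rec["NOM_EPOUX"]):
            continue
        if epouse_re and not epouse_re.search(rec["NOM_EPOUSE"]):
            continue
        hits.append(rec)
    return hits
```

With this shape, `standard_search(rows, nom_epoux="MARTIN", date_max=1850)` returns grooms whose surname starts with MARTIN (including MARTINEZ, but not SAINT-MARTIN) married in or before 1850. The phonetic expansion of Recherche Evoluée is not covered, since its rules cannot be recovered from the logs.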