It doesn't matter whether the solution is a framework, a tool, or anything else. The problem is pretty hard to solve; I've been fighting with it for years.
Here is an example to clarify what I mean.
File1
<head>
<title>Fotografia Elenco Completo Filtri Professionali</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<META name="Language" content="it">
<META http-equiv="Revisit-After" content="2 days">
<style>
<!--
table.MsoNormalTable
{mso-style-parent:"";
font-size:10.0pt;
font-family:"Times New Roman"}
-->
</style>
</head>
File2
<head>
<title>Militari</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="keywords" content="militari, ....">
<meta name="robots" content="INDEX, FOLLOW">
<meta name="Language" content="it">
<meta http-equiv="Revisit-After" content="2 days">
<meta name="Rating" content="General">
<link rel="stylesheet" type="text/css" href="./file/stile.css">
<script language="JavaScript">
File3
<head>
<title>Cinema - Recensioni e Trame di Film</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<meta name="keywords" content="recensioni film">
<meta name="description" content="Ottimo sito di recensioni di film, trame di film cinematografice, di Videogame e Romanzi. ">
<meta name="robots" content="INDEX, FOLLOW">
<meta name="Language" content="it">
<meta http-equiv="Revisit-After" content="2 days">
<meta name="Rating" content="General">
<link rel="stylesheet" type="text/css" href="file/stile.css">
<style type="text/css">
body {
background-color:#F0F0F0;
text-align: center;
}
</style>
For a human being, avoiding this kind of code duplication is an obvious task. A person can recognize that "", "" are delimiters, that the order of the lines doesn't matter, which parts can be put into variables (or stored as values in a database), and which files are similar enough to be refactored together.
The whole process doesn't seem terribly hard to automate. But I haven't found any solution so far. Even automating the recognition of the delimiters is hard.
The best approach I've found is to play with regular expression tools and go mad :D
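To make the part about line order and variables concrete, here is a rough sketch in Python of what I would expect such a tool to do in the simplest case, the title and meta tags. It uses the standard library's html.parser instead of regular expressions. The names HeadFields, extract_fields and split_constant_and_variable are made up for illustration, and the sketch ignores style, script and link tags entirely.

```python
from html.parser import HTMLParser


class HeadFields(HTMLParser):
    """Collect the <title> text and <meta> name/http-equiv -> content pairs."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lower-cases tag and attribute names,
        # so <META ...> and <meta ...> arrive here identically.
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = attrs.get("name") or attrs.get("http-equiv")
            if key:
                # Lower-case the key so "Language" and "language" compare equal.
                self.fields[key.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.fields["title"] = self.fields.get("title", "") + data


def extract_fields(html):
    """Return a dict of header fields; the order of the tags is irrelevant."""
    parser = HeadFields()
    parser.feed(html)
    parser.close()
    return parser.fields


def split_constant_and_variable(pages):
    """pages maps filename -> HTML text.

    Returns (constant, variable): fields with the same value in every file,
    and fields whose value differs (None marks a file that lacks the field).
    """
    per_file = {name: extract_fields(html) for name, html in pages.items()}
    all_keys = set().union(*(fields.keys() for fields in per_file.values()))
    constant, variable = {}, {}
    for key in sorted(all_keys):
        values = {name: fields.get(key) for name, fields in per_file.items()}
        distinct = set(values.values())
        if len(distinct) == 1:
            constant[key] = distinct.pop()
        else:
            variable[key] = values
    return constant, variable
```

Called with the three heads above, the constant part would be the candidate for a shared template and the variable part the candidate for database columns. It does nothing about the harder parts of my question, such as discovering the delimiters or deciding which files are similar enough to group.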
After refactoring
file1
header -> PrintHeader();
file2
header -> PrintHeader();
file3
header -> PrintHeader();
GlobalFile
class header
{
    function PrintHeader
    {
        SELECT title, content-type, language, revisit-after, rating, robots, extra_text_unparsed
        into myArray
        FROM header_table
        WHERE filename = $filename
        foreach(v in myArray)
        {
            echo ....
        }
    }
}
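For what it's worth, the pseudocode above could look roughly like this in Python with sqlite3. This is only a sketch under assumptions: the function name render_header is made up, the column names use underscores instead of hyphens so that they are valid SQL identifiers, and I assume that NULL in a column means the page never had that tag.

```python
import sqlite3
from html import escape

# Assumed column names (underscores instead of the hyphens in the pseudocode).
META_HTTP_EQUIV = ("content_type", "revisit_after")
META_NAME = ("language", "rating", "robots")


def render_header(conn, filename):
    """Build the <head> block for one page from its row in header_table."""
    cur = conn.cursor()
    cur.row_factory = sqlite3.Row  # allow access to columns by name
    row = cur.execute(
        "SELECT title, content_type, language, revisit_after, rating, robots,"
        " extra_text_unparsed FROM header_table WHERE filename = ?",
        (filename,),
    ).fetchone()
    if row is None:
        raise KeyError(filename)

    lines = ["<head>", f"<title>{escape(row['title'] or '')}</title>"]
    for col in META_HTTP_EQUIV + META_NAME:
        value = row[col]
        if value is None:
            continue  # assumption: NULL means this page has no such tag
        attr = "http-equiv" if col in META_HTTP_EQUIV else "name"
        # "content_type" -> "Content-Type", "language" -> "Language"
        name = col.replace("_", "-").title()
        lines.append(f'<meta {attr}="{name}" content="{escape(value)}">')
    if row["extra_text_unparsed"]:
        # Whatever could not be parsed into columns is emitted verbatim.
        lines.append(row["extra_text_unparsed"])
    lines.append("</head>")
    return "\n".join(lines)
```

Each page would then only need to call render_header with its own filename. Writing this part by hand is easy; what I am missing is a way to get the rows of header_table filled in automatically from the existing files.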
Any suggestions?