doukan5332 2013-07-02 19:51
浏览 35
已采纳

重构技巧[关闭]

I don't really have experience in factoring. My code is really long, i don't use functions because i don't know if it needs to be a function. I hope you could give me tips so i could clean up my code.

<?php

# Required files
include("simple-html-dom.php");
require("{$_SERVER['DOCUMENT_ROOT']}/config/pipeline-x.php");

# Define variables
$fn = urlencode($_REQUEST['fn']);
$ln = urlencode($_REQUEST['ln']);

# Connect to database
$db = new px_dbasei();
$db->connect("192.168.50.70", "****", "****", "piasdgeline_tesh45t");

# Query database if a record exist
$sql = "SELECT * FROM linkedin_parse "
       ."WHERE "
       ."`first_name` = '{$fn}' AND "
       ."`last_name` = '{$ln}' ";
$results = $db->query($sql);

# If there is no result
if($results->num_rows == 0):

    # Search linkedin and download page
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://www.linkedin.com/pub/dir/?first={$fn}&last={$ln}&search=Search");
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0");
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
    curl_setopt($ch, CURLOPT_TIMEOUT, 8);
    $res = curl_exec($ch);
    curl_close($ch);
    $html = str_get_html($res);

    # Parse records from the download page
    foreach($html->find('li.vcard') as $vcard):
            $table = array();  

        foreach($vcard->find('span.given-name') as $given_name):                    
            $table['first_name'] = (trim(addslashes($given_name->plaintext), " "));
        endforeach; 
        foreach($vcard->find('span.family-name') as $family_name):
            $table['last_name'] = (trim(addslashes($family_name->plaintext)," "));
        endforeach;
        foreach($vcard->find('span.location') as $location):
            $table['location'] = (trim(addslashes($location->plaintext), " "));
        endforeach;
        foreach($vcard->find('span.industry') as $industry):
            $table['industry'] = (trim(addslashes($industry->plaintext), " "));
        endforeach;
        foreach($vcard->find('dd.current-content') as $headline):
            $table['headline'] = (trim(addslashes($headline->plaintext), " "));
        endforeach;
        foreach($vcard->find('a.btn-primary') as $url):
            $table['url'] = addslashes($url->href);
        endforeach;

        # Insert generated results to the database
        $sql = "INSERT INTO linkedin_parse (`first_name`,`last_name`,`location`,`industry`,`headline`,`url`) "
              ."VALUES "
              ."('{$table['first_name']}',"
              ."'{$table['last_name']}',"
              ."'{$table['location']}',"
              ."'{$table['industry']}',"
              ."'{$table['headline']}',"
              ."'{$table['url']}')";
        $db->query($sql);

        # Get last insert id and query database again
        $new_id = $db->insert_id();
        $sql2 = "SELECT * FROM linkedin_parse WHERE `linkedin_parse_id` = '{$new_id}'";
        $result = $db->query($sql2);

        # Display results in HTML
        ?>
        <ol>
            <?php while($row = $result->fetch_assoc()): ?>
                <li class="vcard">
                    <span class="given-name"><?php echo $row['first_name'] ?></span>
                    <span class="family-name"><?php echo $row['last_name'] ?></span>
                    <span class="location"><?php echo $row['location'] ?></span>
                    <span class="industry"><?php echo $row['industry'] ?></span>
                    <dd class="current-content">
                        <span><?php echo $row['headline'] ?></span>
                    </dd>
                    <a href="<?php echo $row['url'] ?>"></a>
                </li>
            <?php endwhile; ?>
        </ol>
        <?php
    endforeach;
else:
    # Query database if record is 30 days old
    $sql = "SELECT * FROM linkedin_parse "
          ."WHERE "
          ."`first_name` = '{$fn}' AND"
          ."`last_name` = '{$ln}' AND"
          ."`date_inserted` >= DATE_SUB(NOW(), INTERVAL 30 DAY)";
    $results = $db->query($sql);

    if($results->num_rows != 0):
        # Retrieve from database
        $sql = "SELECT * FROM linkedin_parse "
          ."WHERE "
          ."`first_name` = '{$fn}' AND"
          ."`last_name` = '{$ln}' ";
        $result = $db->query($sql);

        # Display results in HTML
        ?>
        <ol>
            <?php while($row = $result->fetch_assoc()): ?>
                <li class="vcard">
                    <span class="given-name"><?php echo $row['first_name'] ?></span>
                    <span class="family-name"><?php echo $row['last_name'] ?></span>
                    <span class="location"><?php echo $row['location'] ?></span>
                    <span class="industry"><?php echo $row['industry'] ?></span>
                    <dd class="current-content">
                        <span><?php echo $row['headline'] ?></span>
                    </dd>
                    <a href="<?php echo $row['url'] ?>"></a>
                </li>
            <?php endwhile; ?>
        </ol>
        <?php
    else:
        # Search linked-in for updated records
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, "http://www.linkedin.com/pub/dir/?first={$fn}&last={$ln}&search=Search");
            curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0");
            curl_setopt($ch, CURLOPT_HEADER, 0);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
            curl_setopt($ch, CURLOPT_TIMEOUT, 8);
            $res = curl_exec($ch);
            curl_close($ch);
            $html = str_get_html($res);

            # Parse records from the download page
            foreach($html->find('li.vcard') as $vcard):
                $table = array();  

                foreach($vcard->find('span.given-name') as $given_name):                    
                    $table['first_name'] = (trim(addslashes($given_name->plaintext), " "));
                endforeach; 
                foreach($vcard->find('span.family-name') as $family_name):
                    $table['last_name'] = (trim(addslashes($family_name->plaintext)," "));
                endforeach;
                foreach($vcard->find('span.location') as $location):
                    $table['location'] = (trim(addslashes($location->plaintext), " "));
                endforeach;
                foreach($vcard->find('span.industry') as $industry):
                    $table['industry'] = (trim(addslashes($industry->plaintext), " "));
                endforeach;
                foreach($vcard->find('dd.current-content') as $headline):
                    $table['headline'] = (trim(addslashes($headline->plaintext), " "));
                endforeach;
                foreach($vcard->find('a.btn-primary') as $url):
                    $table['url'] = addslashes($url->href);
                endforeach;

                # Update records
                $sql = "UPDATE linkedin_parse "
                      ."SET "
                      ."`date_inserted` = now(),"
                      ."`first_name` = '{$table['first_name']}',"
                      ."`last_name` = '{$table['last_name']}', "
                      ."`location` = '{$table['location']}', "
                      ."`industry` = '{$table['industry']}', "
                      ."`headline` = '{$table['headline']}', "
                      ."`url` = '{$table['url']}' "
                      ."WHERE "
                      ."`first_name` = '{$table['first_name']}' AND"
                      ."`last_name` = '{$table['last_name']}' AND "
                      ."`location` = '{$table['location']}' ";
                $result = $db->query($sql);
                ?>
                <ol>
                    <?php while($row = $result->fetch_assoc()): ?>
                        <li class="vcard">
                            <span class="given-name"><?php echo $row['given-name'] ?></span>
                            <span class="family-name"><?php echo $row['family-name'] ?></span>
                            <span class="location"><?php echo $row['location'] ?></span>
                            <span class="industry"><?php echo $row['industry'] ?></span>
                            <dd class="current-content">
                                <span><?php echo $row['headline'] ?></span>
                            </dd>
                            <a href="<?php echo $row['url'] ?>"></a>
                        </li>
                    <?php endwhile; ?>
                </ol>
                <?php

            endforeach;
    endif;
endif;
  • 写回答

1条回答 默认 最新

  • douangzhao4108 2013-07-02 20:47
    关注

    As a general concept, I would recommend a few things:

    • "DRY" or "Don't Repeat Yourself" is a good concept to start with, as others have mentioned. If you do something more than once, chances are that it deserves its own function or can be simplified.
    • While typically applied to object-oriented programming, the "single responsibility principle" can be well applied to functions for their maintainability, but you should also avoid creating functions just because you can (function calls require overhead). Eventually, collections of functions often end up in reusable classes.
    • "KISS" or "Keep it simple, stupid" (note: this is better looked at as a self-referencing "stupid", like when someone says, "I'm such an idiot!" when they figure something out) -- Simplify your logic and code whenever you can. "The simplest explanation is usually the correct one."

    To apply these concepts, here is how I would re-structure (not how I would write) your script:

    1. Query whether any profiles match that are less that 30 days old, since you are updating the profiles when none exist or all are more than 30 days old.
    2. If you didn't return any rows:
      • Query whether any profiles match at all, to determine updates vs inserts.
      • Download and parse the page.
      • Save the new/updated records.
      • Keep the matches you parsed, rather than downloading from the database what you just inserted/updated.
    3. Finally, display your matches (regardless of whether they came from database or parsing).

    By restructuring your code this way:

    • You eliminate most of the duplication and potential confusion
    • You simplify your codepath
    • You group your logical components (search, parse/store if necessary, display)
    • All of your HTML can be placed at the end of your script, which is more readable (or can be easily placed in a separate file)
    • The places where a function will make sense will be more apparent, such as your parsing loops.

    Last note -- you should place an emphasis on security whenever you process data from an "unknown source" (user, website, provided file, etc.). While addslashes() and urlencode() are a nice idea, there are a number of resources that can help you understand how to avoid SQL Injection, cross-site scripting and other potential threats. An example of a risk in your code is the use of $_REQUEST without escaping your database query.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 win11 23H2删除推荐的项目,支持注册表等
  • ¥15 matlab 用yalmip搭建模型,cplex求解,线性化处理的方法
  • ¥15 qt6.6.3 基于百度云的语音识别 不会改
  • ¥15 关于#目标检测#的问题:大概就是类似后台自动检测某下架商品的库存,在他监测到该商品上架并且可以购买的瞬间点击立即购买下单
  • ¥15 神经网络怎么把隐含层变量融合到损失函数中?
  • ¥15 lingo18勾选global solver求解使用的算法
  • ¥15 全部备份安卓app数据包括密码,可以复制到另一手机上运行
  • ¥20 测距传感器数据手册i2c
  • ¥15 RPA正常跑,cmd输入cookies跑不出来
  • ¥15 求帮我调试一下freefem代码