php - preg_replace to change drupal style links to standard a tags


I have content that I am migrating from a Drupal site. I'm not certain whether this is a standard Drupal 'feature', but all the a tags have been converted to this format: [link label](link url)

I am attempting to convert these back to normal links with preg_replace with limited success. This is my current attempt:

$pattern = '/\[*\](.*?)/si';
$subject = 'aving Accounts [Top Yields]( - Share ';
$replacement = '<h1>';
echo preg_replace($pattern, $replacement, $subject, -1 );

(Please ignore the h1 tag replacement; obviously this isn't what I would use to create a link tag.)

This gives the result:

'aving Accounts [Top Yields<h1>( - Share'

I feel like I should be using a different function in order to preserve the content between the brackets, but a search through the PHP manual / Google / Stack Overflow has thrown up nothing useful.

edit (full example string):

'Welcome to the latest Edition of our Weekly Bulletin. Here we aim to provide our Subscribers and Clients with useful news, opinion and analysis in relation to macro-trends affecting the investment performance of real-assets and the general economy, ultimately facilitating well-informed investment decisions. In this bulletin we seek to provide you with a comparison of the three best money-market income investments, and one real-asset income investment. DGC Asset Management is a market-leading boutique providing Investors and Financial Advisors with access to market-leading Research Reports, and opportunities to invest in prime, productive property assets with a track record of generating annual yields exceeding 15%. By creating a credible point of reference, DGC aims to facilitate well-informed investment decisions amongst our Clients and Subscribers seeking to optimise and diversify investment portfolios with real-asset alternatives. ##A Comparison of the Best Income Investments in 2012## As most readers will be painfully aware, risk-free returns i.e. returns on insured cash deposits, remain ridiculously low as central banks continue to supress interest rates in an effort to stimulate economic growth (or offset economic contraction). In this Weekly Bulletin we analyse a range of traditional [income investments](/income-investments) and one DGC alternative, comparing the real-return (adjusted for inflation) and capital risk to a theoretical investment of £100,000 over 3 years This basic analysis is designed to offer Investors and Advisors with a simple comparison of the best of the best in terms of income investments. We have made a number of assumptions in our calculations, and all sources are referenced at the end of this page. This information should not be construed as financial advice or investment advice, and one should of course consult a Financial Advisor in order to ascertain the individual suitability of any given investment. 
###Assumptions### **Capital invested:** £100,000 **Term:** 3 years **Tax rate:** 40% **Inflation:** 3.5% ###Best Savings Account### **Product:** Close Brothers 3-Year Fixed Rate Account **Gross Annual Rate:** 4% **After Tax:** 2.4% **After Tax and Inflation:** -1.06% **Real value after three years:** £96,845.36 **Real profit after 3 years:** -£3,154.64 **Risk**:- Up to £85,000 is insured under the FSCS, and Close Brothers are pretty stable, so risk to capital is minimal as you would rightly expect from an investment that generated a 'real' loss. ###Best Share Dividends### **Product:** Man Group Dividend **Gross Annual Rate:** 19.28% **After Tax:** 11.57% **After Tax and Inflation:** 7.8% **Real value after three years:** £125,262.55 **Real profit after 3 years:** £25,262.55 **Risk**:- As with any listed equity, Investors are exposed to the day to day vagaries of financial markets, and to the price performance of the underlying company. As we are measuring performance over a three year period, we should consider the volatility in the share price for the past three years, during which time Man Group's share price has fallen from 229.25 to 83.00, a variance of 63.8%. This simple analysis shows us that Investors who bought the stock three years ago have lost over half of their capital. ###Best Corporate Bond### **Product:** Enterprise Inns PLC Corporate Bond **Gross Annual Rate:** 10.45% **After Tax:** 6.27% **After Tax and Inflation:** 3.1% **Real value after three years:** £108,245.78 **Real profit after 3 years:** £8,245.78 **Risk**:- Just as with equities, the value of bonds will rise and fall based on the perceived stability of the issuing company. Again, we should consider the volatility in the share price for the past three years, during which time Enterprise Inns' share price has fallen from 137.75 to a low of 26.5 at the beginning of this year; a variance of 80.8%. 
Whilst this analysis provides some insight into company stability, we should really look at fluctuations in the traded value of the bond, which was originally issued with a coupon of 6.5%, and has fallen in value by around 50% since issue before rebounding somewhat. This level of volatility presents a significant risk to capital. Now, for good measure, we will also throw in our own 3-Year Exit Strategy Investment, based on the acquisition and short-term, leverage assisted disposal of heavily discounted high-yield property assets. ###Exit Strategy Investment### **Product:** Asset Backed 3-Year Property Investment **Gross Annual Rate:** 15% + 100% capital uplift after 3 years **After Tax:** 9% + 60% capital uplift after 3 years **After Tax and Inflation:** 5.5% + 54.12% capital uplift after 3 years **Real value after three years:** £170,920.76 **Real profit after 3 years:** £70,920.76 **Risk**:- Invested capital is secured directly against real-estate with an appraised (and disposal) value of circa. 200% of invested capital. Throughout the term of the investment (3 years), capital is always either secured against real-estate in this way, or held liquid in escrow, so the risk to capital is extremely low. However, Investors must be in a position to tolerate the potential illiquidity associated with property assets, although in this case properties are acquired and disposed of for a significant profit within a 30 day period, therefore we consider the risk profile of this product to be low to medium. ##Net Cash Profits After 3 Years Adjusted for Inflation## [](/user/login) ###Methodology### a) Gross Annual Rate: As advertised b) After Tax: Advertised rate less 40% c) After Tax and Inflation: After tax rate less 3.5% d) Real value after three years: ((100000 x B x B x B)/1.035/1.035/1.035) e) Real profit after 3 years: d - 100000 ###References### [This is Money]( - Saving Accounts [Top Yields]( - Share Dividends [Investors Chronicle]( - Corporate Bonds'

(Apologies for the verbose nature of the example; I feel like omitting anything may impact the results.)

As you may see, I am currently getting the final URL set as the URL of the first link.





Does this do what you want?

   $pattern = '/\[([^]]*)\]\(([^\)]*)\)/si';
   $subject = <<<EOF
Text text text [LINK_1_TEXT](JUNK_LINK_1) - Share 
Text text text [LINK_2_TEXT](JUNK_LINK_2)
EOF;

   $replacement = '<a href="\2">\1</a>';

   echo "TEXT IS:\n" . $subject . "\n";
   echo "\nRESULT:\n";
   echo preg_replace($pattern, $replacement, $subject, -1 );

It outputs the following:

TEXT IS:
Text text text [LINK_1_TEXT](JUNK_LINK_1) - Share 
Text text text [LINK_2_TEXT](JUNK_LINK_2)

RESULT:
Text text text <a href="JUNK_LINK_1">LINK_1_TEXT</a> - Share 
Text text text <a href="JUNK_LINK_2">LINK_2_TEXT</a>

The regular expression '/\[([^]]*)\]\(([^\)]*)\)/si' breaks down into the following parts:

  1. \[([^]]*)\] The first part matches the square brackets. You had this pretty much correct. The brackets are correctly escaped so that they are not interpreted as a character class. The only change is to use [^]]* rather than *. The original would search for zero or more [ characters; the change searches for a literal '[' followed by zero or more characters that are not a closing bracket (otherwise you get a greedy match, explained under "Greedy matches"), followed by a literal ']'.

    The curved brackets surrounding [^]]* save whatever it matches as a group, called a capturing group, that is later referred to in the replacement string as '\1'. This reference is called a backreference.

  2. \(([^\)]*)\) This matches a set of curved brackets containing text. ([^\)]*) creates the second group and is referred to in the replacement pattern as \2. [^\)] matches any character that is not a ). If you use .* you'll get a greedy match.
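
To see the two groups described above, here is a minimal sketch that prints what each group captures before doing the replacement. The sample string is an assumption, loosely based on two links from the question's example; the /si flags are left out because this pattern contains no letters and no dot, so they should make no difference.

```php
<?php
// Same pattern as above, minus the /si flags (assumed to have no effect here).
$pattern = '/\[([^]]*)\]\(([^\)]*)\)/';

// Hypothetical sample text; includes a link with an empty label.
$subject = 'See [income investments](/income-investments) or [](/user/login).';

// PREG_SET_ORDER gives one array per match: [full match, group 1, group 2].
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    printf("full: %s | label: '%s' | url: '%s'\n", $m[0], $m[1], $m[2]);
}

// \1 and \2 in the replacement refer back to the label and URL groups.
$result = preg_replace($pattern, '<a href="\2">\1</a>', $subject);
echo $result, "\n";
```

Note that a link with an empty label, such as [](/user/login), still matches and produces an a tag with no text, so you may want to handle that case separately.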

Greedy matches

If I use the string (This Text) blah [blah](blah) some more text

\(.*\) matches (This Text) blah [blah](blah) because a greedy match is made: it tries to match as much of the string as possible.

If you use \([^\)]*\) then, using the above example string, it will match the pattern in the string twice, matching (This Text) and (blah) separately.
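
The difference can be checked directly with preg_match_all. This is a small sketch using the example string from this section; the variable names are only for illustration.

```php
<?php
$subject = '(This Text) blah [blah](blah) some more text';

// Greedy: .* runs on to the last ")" in the string, so there is one long match.
preg_match_all('/\(.*\)/', $subject, $greedy);

// Negated class: [^\)]* cannot cross a ")", so each bracketed part matches on its own.
preg_match_all('/\([^\)]*\)/', $subject, $negated);

print_r($greedy[0]);   // one element
print_r($negated[0]);  // two elements
```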




How about:

$pattern = '/\[(.*?)\]\((.*?)\)/';
$replacement = '<a href="\\2">\\1</a>';
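
For comparison, here is a rough sketch of how this lazy pattern and the negated-class pattern from the other answer behave on the same input. The input string is made up for illustration: it has a stray bracketed word before the real link. On simple input the two patterns should give the same result, but here the lazy version keeps extending its first group until it finds a "](" pair.

```php
<?php
$lazy    = '/\[(.*?)\]\((.*?)\)/';
$negated = '/\[([^]]*)\]\(([^\)]*)\)/';
$replacement = '<a href="\\2">\\1</a>';

// Hypothetical input: "[note]" is not a link, "[label](url)" is.
$subject = '[note] see [label](url)';

// Lazy: the first group grows past "note]" until a "](" is found.
$lazyResult = preg_replace($lazy, $replacement, $subject);

// Negated class: the first group cannot contain "]", so "[note]" is skipped.
$negatedResult = preg_replace($negated, $replacement, $subject);

echo $lazyResult, "\n";
echo $negatedResult, "\n";
```

Also note that without the s modifier, .*? will not match across a newline, so a label or URL that spans two lines would be left unconverted by the lazy pattern.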
